PREFACE.



This book is based on lectures given at the London 
School of Economics and Political Science in the five 
years following its foundation in 1895. There seems to 
be no text-book in English dealing directly and com- 
pletely with the common methods of statistics. English 
writings on the various branches of the science are for 
the most part in the form of articles in the journals of 
learned societies. Professor Mayo Smith in his Statistics 
and Sociology proceeds almost at once to historical 
applications ; while in Professor Meitzen's Geschichte, 
Theorie, und Technik der Statistik, issued in English
by the American Academy of Political and Social 
Science, so much space is devoted to the history of 
the development of statistics, and the book is so slight, 
in comparison with the wide field it covers, that many 
elementary methods are treated very cursorily. In the 
excellent books in French, German, and Italian on this 
subject there is a general tendency to deal at length 
with the history of official statistics, the limits of the 
science, and particular applications of the theory of pro- 
bability, to the exclusion of more general matter; so that 
a student must refer to the works of Dr Mayr, Professor 
Westergaard, Professor Lexis, Professor Gabaglio, M. 
Block, and Dr Bertillon before he is completely
acquainted with the elementary methods of statistics. The
result is that there is no compact statement of principles
acknowledged by statisticians, of the methods common to
most branches of statistical work, of the artifices developed
for handling and simplifying the raw material, and of the
mathematical theorems by the use of which the results of
investigations may be interpreted. This book forms an
attempt to supply this want, so far as can be done without
undue length. No place has been given in it to the
history of statistics, and it does not contain any summary
of the main groups of statistics extant ; several tables,
drawn from a wide range of subjects, are given, but only
to illustrate particular methods, and their choice has been
determined by their suitability for this purpose. In the
chapter on Collection of Material some account is offered
of the genesis of the most important English statistics :
the great part of the figures tabulated in the Statistical
Abstract can be traced back to the householder's schedule
of the Population Census or the custom house returns of
foreign trade, while the chief statistics accessible for the
study of modern social questions have come from the
Wage Census of 1886 or are collected by the Labour
Department : it is hoped that the account of these four
groups of figures will afford some help in judging of
[...] limitations. Considerable space has
[...] subjects of Averages and Diagrams,
[...] universal, and, while their principles
[...] simple, their application is often mis[...]
chapter on Accuracy is based on the
[...] of 1897, and may perhaps be found
[...] that is new, a claim which is not
[...] part of the book. The treatment
throughout is intended to be suitable for those whose
mathematics have not been carried to any height or have 
become rusty from disuse. With this view, when mathe- 
matical symbols were unavoidable, the preliminary 
hypotheses have been first discussed without algebraic 
notation and at some length, and those proofs have been 
chosen which require the minimum mathematical know- 
ledge rather than those which lead most directly to the 
result. Thus the most important results of the Theory 
of Error have been obtained without the use of the Differ- 
ential or Integral Calculus, and it is hoped that the greater 
part even of the chapter on Correlation will be intelligible 
to those who are not so well equipped as the Major- 
General in the Pirates of Penzance. Part II. is in- 
tended to be introductory and is certainly incomplete ; the 
normal law of frequency is the only one discussed, and 
the correlation of three variables is untouched. The more 
advanced treatment of this part of the subject is likely 
to be of interest to but few, who will have little trouble 
in obtaining the books and journals in which the further 
development may be found. Short bibliographies are 
added to the chapter on Interpolation and to Part II. for 
this purpose. It is hoped that this elementary handling 
may be of use to some who are interested in the statistical 
arguments based on the Laws of Probability, and that 
the definitions, formulae, and proofs given may save others 
from the necessity of searching in books, long out of 
print, for elementary theorems and deductions. The 
treatment in Part II., Section II., is peculiar in that it 
leaves very much in the background the Method of Least 
Squares ; the phrase, useful in some connections, seems 
to make the application of the Law of Error to statistics
unnecessarily complex. I am much indebted to Professor
Edgeworth, who has not only given me continued help 
both privately and by his publications in the study of the 
mathematical treatment of statistics, but has also read 
Part II. in proof and suggested many useful and important 
alterations. My thanks are also due to Professor Everett 
and Mr W. F. Sheppard for help in the chapter on 
Interpolation, and to Mr C. P. Sanger and Mr H. Clissold 
for reading great parts of the book in proof. 

A. L. B. 



London School of Economics, 
January 1901.

CHAPTER I.

SCOPE AND MEANING OF STATISTICS. 

Very many definitions have been given of the word statistics,
and each author who has written on the subject has assigned new
limits to the field which should be included in its
scope. It will not be necessary for the purpose of
this book to discuss the merely verbal differences involved, but 
only to explain what is intended by its title, and to consider 
the limits of the science which it is proposed to investigate. It 
will be useful, however, to mention some possible definitions. 

Statistics may, for instance, be called the science of counting.
Counting appears at first sight to be a very simple operation,
which any one can perform or which can be done
automatically ; but, as a matter of fact, when we
come to large numbers, e.g., the population of the United King-
dom, counting is by no means easy, or within the power of an
individual ; limits of time and place alone prevent it being so
carried out, and in no way can absolute accuracy be obtained
when the numbers surpass certain limits. Great numbers are
not counted correctly to a unit, they are estimated ; and we might
perhaps point to this as a division between arith-
metic and statistics, that whereas arithmetic attains
exactness, statistics deals with estimates, some-
times very accurate, and often sufficiently so for their purpose,
but never mathematically exact. Statistics generally relate to
numbers so great that their estimation is beyond the power of
an individual, and requires the co-operation of an
organised body of workers. Though the collec-
tion of numbers by several persons and the mere
addition of the results seem simply questions of arithmetic, yet
in practice two difficulties soon occur. First, it is not easy to
define the thing to be counted so explicitly that all the tellers
shall admit and reject instances on the same principles ; for such
simple objects as the number of rooms or stories of a house, a
person's age, even an individual, give rise to such complex ques- 
tions of definition that it is often impossible to tell from a short 
description of a category exactly what items are included in it. 
Secondly, numerical errors cannot be avoided when many 
workers are involved ; for some among a large number of
persons will be inaccurate, some unintelligent, some will not 
obtain complete information, and when their reports are com- 
piled there will be occasional mistakes in copying and errors in 
tabulation. A total which is the result of the work of many 
hands will certainly from one cause or another fall short of 
complete accuracy. But though all estimates of this nature are 
sometimes included under the term statistics, this definition at
once is too wide, and also does not bring out the distinctive 
nature of statistical method. 

It is better, in fact, to define statistics a posteriori. In dealing 
with masses of figures, large numbers descriptive of groups, series 
of totals or averages relating to different dates or
places, it is found that special methods become
necessary — methods which depend on particular properties of 
large numbers, methods which are suitable for describing com- 
plex groups so that they can be easily comprehended, methods 
for analysing the accuracy of statements, for measuring the 
significance of differences, for comparing one estimate with
another. Those estimates to which these methods apply are 
within the scope of statistics ; it is the study of these methods 
that is the object of this book. It is clear that, under our 
tentative definition, statistics is not merely a branch of political 
economy, nor is it confined to any one science. A
knowledge of statistics is like a knowledge of
foreign languages or of algebra : it may prove of
[...] time under any circumstances.
[...] be interesting to trace the connection of statistical
[...] with various branches of knowledge. To begin
with the physical sciences : there are two points
in which this method touches astronomy. The
[...] least squares was introduced by an astronomer,
[...] choose the best of several slightly discrepant observations [...]
the position of a star. In most physical observations
[...] measurements are taken of the same quantity, and it is
[...] however carefully they are made, they never absolutely
agree ; just as the averages obtained by different statisticians 
from the same series of sociological observations are generally 
not identical. From such a group of measurements it is neces- 
sary to deduce the most probable estimates ; this is done by the 
application of the law of error, known as the method of least 
squares. 
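
The idea may be sketched as follows. This is an illustration only, not the author's own procedure: it assumes that all the measurements are equally trustworthy, in which case the value making the sum of squared discrepancies least is their arithmetic mean. The figures are invented and the function names are hypothetical.

```python
def sum_of_squares(estimate, measurements):
    """Sum of squared discrepancies between an estimate and each measurement."""
    return sum((m - estimate) ** 2 for m in measurements)


def least_squares_estimate(measurements):
    """Arithmetic mean: the value that minimises the sum of squared discrepancies
    when every measurement is given equal weight (an assumption of this sketch)."""
    return sum(measurements) / len(measurements)


# Invented, slightly discrepant measurements of one and the same quantity.
measurements = [10.2, 9.9, 10.1, 10.0, 9.8]
best = least_squares_estimate(measurements)

# Any candidate a little above or below the mean gives a larger sum of squares.
for candidate in (best - 0.1, best + 0.1):
    assert sum_of_squares(candidate, measurements) > sum_of_squares(best, measurements)
```

Measurements of unequal trustworthiness would call for a weighted mean, which this sketch does not attempt.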

The other point of resemblance of statistical to astronomical 
method is common also to geology and to most applied sciences. 
The course of scientific measurement has generally
been to take first a rough observation of a quantity,
such as the distance of the sun, the thickness of a stratum, the 
atomic weight of an element, the specific gravity of a substance ; 
then, as information accumulated, as the precision of instruments 
increased and methods were better adapted, to make the measure- 
ment gradually more and more accurate. It is important 
to appreciate this development, for in the present state of our 
knowledge, many statistical measurements cannot be made with 
precision for want of data, and a critic is inclined to say that for 
this reason preliminary estimates are valueless ; but from the 
scientific point of view this criticism is wrong, for a faulty 
measurement made on logical principles is better than none,
and may lead to others with progressive improvement. 

Passing by the general resemblance of statistical investigations 
to all scientific experiments, we may notice the use of statistics 
in biology. It was, perhaps, not recognised before
the publication of Professor Karl Pearson's inves-
tigations,* that the whole doctrine of evolution and heredity
rests in reality on a statistical basis. It is in this direction that 
the most important new work in statistics is being done. It may 
be worth while to sketch very briefly the nature of the problem. 
Out of a great number of observations, say the measurements of 
the heights of a group of men, the type is found — the average, 
about which all the measurements are grouped according to some 
definite law. The problem is then to determine whether this 
type or the grouping about it changes, and in what way. The 
differences found in successive generations form the data on 
which arguments as to evolution and development are founded. 
The method applies equally to fossil remains, to zoological 
species, and to many other groups. If it is neglected, many
valid arguments lose a great part of their force, and theories
are founded on personal impressions of phenomena instead
of on scientific measurement. The work done in this
direction becomes of immediate use to the student of social
questions. The average wage and the grouping about it
and the change in these quantities present precisely similar
problems ; the change in the purchasing power of money is
calculated by the same mathematical formulae ; in fact, these
methods furnish the only accurate way of measuring numerical
changes in complex groups. Much valuable information has
been collected in anthropometrical laboratories, which has in-
creased the statistician's knowledge of facts and given birth to
important theoretical principles.

* See The Grammar of Science, chap. x. seq., and the references there given.

Meteorology has much in common with statistics. The chief
measurements taken for the purposes of this science are of
temperature, barometrical pressure, moisture of the
air, and force of the wind. One of the problems
attacked is again that of finding the type from a group of
observations, and of measuring its change. The tables which
state the average temperature year by year are in many ways
similar to those which the Registrar-General publishes of births,
deaths, and marriages. Without the aid of statistical method, 
the averages obtained show mere numbers from which no logical 
deductions can be made. With the help of this knowledge, it 
can be seen whether the change from year to year is significant 
or accidental ; whether the figures show a progressive or periodic 
change ; whether they obey any law or not. The problem is 
easily seen to be of importance for forecasting the future 
population and for many similar purposes. 

We are thus brought by a short step to the province to which 
statistics has sometimes been confined : the study of demography. 
If in demography we include, not merely the
measurement of the numbers of the population,
the birth, marriage, and death rates, the distribution by age, by 
sex, and by locality, in fact, the figures which naturally come 
from the census and the Registrar-General's returns ; but include 
also, industrial and social measurements, of distribution of the 
population by trade, of income, wages, production, foreign trade,
transportation, and so forth ; we have extended the limits of 
demography till it includes the majority of the statistical 
investigations directly interesting to students of sociology
or of political economy. Without stopping to decide the
exact limits of demography, we can quickly pass to another 
definition of statistics (so far as it concerns such students) on 
which it is wished to lay a certain stress : statistics is the science 
of the measurement of the social organism, regarded as a whole, in
all its manifestations. In a monograph, after the
fashion of Le Play, a single family is studied ; the
occupations and earnings of its members, the way
these earnings are spent, and its economic position
generally are set down ; but this study is not so far statistical. 
In demography we study the same quantities when groups of 
families are concerned ; the number of families engaged in certain 
industries, and their average receipts, expenditure, and savings ; 
here we have statistics. In the monographic method the indi- 
vidual is everything ; in the statistical method, nothing. When 
we wish to obtain a measurement of the group, peculiarities 
of individuals receive no attention ; it is only when the same 
peculiarities are possessed by many persons that they become of 
importance. Statistics may rightly be called the science of 
averages. In the measurement of a complex group, say of 
incomes and wages, the exceptional artiste who can earn £100
in an evening, and the inefficient labourer who can only make 
sixpence a day, affect only slightly the general average ; they 
are not entered in separate categories ; but the large group of 
skilled artisans who can earn over forty shillings a week, or of 
casual labourers who make less than fifteen shillings, are entitled 
to separate notice. The exact specification to be adopted is only 
a question of degree, which differs with the nature of the par- 
ticular investigation in hand. The object of a statistical estimate 
of a complex group is to present an outline, to enable the mind 
to comprehend with a single effort the significance of the whole. 
To do this it is necessary to exclude rigorously any presentation 
of details, for the same reason that, in a painter's rendering of a 
tree, the individual leaves are not distinguished. The outline 
will be a little blurred, a little inaccurate ; but it will be as 
distinct and detailed as the mind has power to grasp it, or the 
eye to see it ; the impression will be rightly given. There is a 
very important principle involved in this method. The individual 
members of a group vary continually, the whole group varies 
very slowly. It is impossible to follow or measure the motions 
of separate atoms ; it is comparatively easy to state the laws of
motion for a solid body. Great numbers and the averages
resulting from them, such as we always obtain in measuring 
social phenomena, have great inertia. The total population, the 
total income, the birth and death rates, average wages, change 
very little; similar quantities relating to a single family change 
very fast. It is this constancy of great numbers that makes 
statistical measurement possible. It is to great numbers that 
statistical measurement chiefly applies. 
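
A small numerical experiment may illustrate this constancy of great numbers. It is a sketch under stated assumptions and not a result from the text: each member's value in each "year" is assumed to be drawn independently and uniformly between 0 and 100, the group is taken to have 10,000 members, and all names are hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so that the sketch is repeatable


def yearly_values(n_members):
    """Invented individual values for one 'year', varying widely between members."""
    return [random.uniform(0, 100) for _ in range(n_members)]


# Twenty successive 'years' for a group of 10,000 members.
years = [yearly_values(10_000) for _ in range(20)]

# The value of one individual (the first member) from year to year.
individual_series = [year[0] for year in years]
# The average of the whole group from year to year.
group_series = [statistics.mean(year) for year in years]

# Spread (population standard deviation) of each series over the twenty years.
individual_spread = statistics.pstdev(individual_series)
group_spread = statistics.pstdev(group_series)

print("individual spread:", round(individual_spread, 2))
print("group-average spread:", round(group_spread, 2))
```

Under these assumptions the group average moves far less from year to year than any single member does, which is the inertia described above.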

The relation of statistics to political economy is a simple 
one. Professor Marshall says,* "Statistics are the straw out of
which I, like every other economist, have to make
the bricks." The statistician furnishes the political
economist with the facts, by which he tests his
theories or on which he bases them. Since the economist deals
chiefly with phenomena relating to groups, and regards the
individual only as a member of a group, it is to statistics as
the science of averages that he looks for his information. When
he is dealing with national economy, with the volume of trade, 
for instance, or the purchasing power of money, he is limited to 
pure theory, till statistics as the science of great numbers has 
provided the facts. The chemist experimenting in his laboratory 
is like the statistician ; the chemist theorising in his study is like 
the economist. Because of this relation it may be held to be the 
business of the statistician to collect, arrange, and describe, like 
a careful experimentist, but to draw no deductions ; even in an 
investigation relating to cause and effect, to present evidence but 
not conclusions. As a distinct operation, of course, the statistician
may assume the rôle of the economist, for the same man may well
be fitted to conduct the experiment and fit the theory. And just
as a theoretical chemist will have little or no power unless he 
fully appreciates experimental methods and difficulties, even if 
he has not the manual dexterity to conduct them to perfection 
himself, so no student of political economy can pretend to com- 
plete equipment unless he is master of the methods of statistics, 
[...] difficulties, can see where accurate figures are possible,
[...]ise the statistical evidence, and has an almost instinctive
[...] of the reliance that he may place on the estimates
[...]

* Evidence to the Committee on the Census, 1890.

The proper function, indeed, of statistics is to enlarge indi-
vidual experience. An individual is limited to what he can
himself see, a very small part of one division of
the social organism ; his knowledge is extended in
various ways, by the conversation of his acquaint- 
ance, by newspaper reports, by the writings of experts. Accord- 
ing to his ability and power of judgment, he will be able to form 
a correct view of the numerical importance of groups of persons 
and things ; but it is in the highest degree improbable that he 
will not have been biassed by the peculiarities of his position, 
and that he will place his different items of information in the 
right perspective ; and he will not be able to gauge rightly the 
accuracy of his data. As soon as he begins to examine these 
points he is undertaking a statistical investigation, and will very
soon find himself involved in all the difficulties and problems 
from which a knowledge of statistical method alone can dis- 
entangle him. This is the obvious answer to those who deny 
the use of statistics. A statistical estimate may be good or bad, 
accurate or the reverse ; but in almost all cases it is likely to be 
more accurate than a casual observer's impression, and in the 
nature of things can only be disproved by statistical methods. 

A chief practical use of statistics is to show relative import- 
ance, the very thing which an individual is likely to misjudge. 
Statistics are almost always comparative. The ab-
solute magnitude of a quantity is of little meaning
to us till we have some similar quantity with which to compare 
it. A statement of the number of paupers in the United King- 
dom is valueless unless we know the total population. A state- 
ment of the number of gallons of water supplied per head to the 
people of East London is of little meaning to us till we know 
the quantity supplied to other towns. The average wage, shown 
in the Wage Census, does not convey its full significance till we 
have similar computations for other countries or relating to other 
years. In the case of most statistical estimates, it will be found 
that we need another for comparison before we can appreciate 
the meaning of the first. 
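
The point can be put in a short calculation. The sketch below is illustrative only: the district names and all figures are invented, and the function name is hypothetical. It shows that an absolute count may rank two districts in one order while the rate per thousand of population ranks them in the other.

```python
def rate_per_thousand(count, population):
    """Express an absolute count as a rate per 1,000 of the population."""
    return 1000 * count / population


# Invented figures: (number of paupers, total population).
districts = {
    "District A": (12_000, 400_000),
    "District B": (9_000, 150_000),
}

rates = {
    name: rate_per_thousand(count, population)
    for name, (count, population) in districts.items()
}

for name, rate in rates.items():
    print(name, rate, "per thousand")
```

District A has the larger absolute number, yet District B has the higher rate; only the second statement admits of comparison between places or years.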

If the group of objects which we wish to measure is large,
its enumeration will be beyond our unassisted efforts, or those of
any organisation at our command. Some investiga-
tions, indeed, have been successfully conducted by
private organisations, for instance, those which resulted in Mr
Booth's Life and Labour of the People, and Leone Levi's Wages and
Earnings ; but in general the measurement of a part of the social
body or industrial organism must be undertaken by the central or
local governments, if it is to be successfully carried out. The fact
that this is the case explains the heterogeneousness and imper-
fection of the mass of statistics extant. A government naturally 
collects numerical information only in relation to its own func- 
tions. Thus the administration must know the numbers of the 
population and the area of the country in gross and in detail for 
its own purposes. Large groups of figures come simply from the
necessity of public account-keeping. Many official figures are 
bye-products ; for office purposes an account is kept of all 
transactions in which the government has a hand, and of in- 
dustries subject to special regulations ; and the government
publishes most of the figures which thus come in its way. To 
such causes are due our knowledge of the statistics of income, 
education, imports, railways, mines, factories, and so on. Some 
such publications are only survivals from a former time, when 
the figures were directly needed, such as Gazette wheat prices 
(used for the calculation of tithes), and, to some extent, statistics 
of exports. Though few figures are collected simply for scientific 
purposes, yet in many cases schedules issued for administrative 
ends are used at the same time for the reception of other 
information, of use chiefly to the sociological student ; much of 
the Census information comes under this heading. A view of 
those figures, relating to the United Kingdom, which are easily 
accessible to the student, can be obtained by turning through 
the annual Statistical Abstract for the United Kingdom, the
Annual Abstract of Labour Statistics, and the Registrar-General's
Annual Report ; in one or other of these, summaries of, and 
references to, most official statistics are to be found. 

It is clear that figures collected simply in connection with 
administrative purposes are not likely to be precisely those
which are needed by the student of sociology or
political economy. Even where the wants of the
[...] and the student are nearly identical, the classification and
[...]on may not meet scientific requirements. There has,
[...] been considerable progress in recent years, due one may
[...] to the influence of Sir R. Giffen, in the direction of
[...]ing statistical information not absolutely needed by the
administration, and most of the work of the Labour Department
[...]s kind ; but very much more might reasonably be done,
at an expense which would be almost negligible when considered
in relation to the national income. Thus the census might be 
made, in part at least, quinquennial, and the body of workers, 
who are organized once in ten years to conduct it, only to be 
disbanded when the report is issued, might be made permanent 
and entrusted with the organization of a decennial industrial 
census. Market prices of many staple commodities could be 
tabulated by local officials in the same way as wheat sales 
are now registered. Movements of goods by rail could be 
tabulated in the same way as transport by water, and the 
anomaly that we know more of our foreign than of our home 
trade be removed. The production of factories might be re- 
turned as well as that of mines. A permanent government 
office might well be charged from time to time with special 
investigations, similar perhaps to the Wage Census of the Board 
of Trade. It needs very little study of statistics or of political 
economy, to feel the pressing need of some of this information. 
Attention may be drawn to some of the gaps in our
knowledge. When dealing with our national income
we can obtain statistics of wages, and of income subject to tax ; 
but for salaries below the exemption limit, and for part of the 
income received for foreign investments, we are forced to rely 
on educated guesses. For the change of the purchasing power 
of money we know, thanks chiefly to the Economist and trade 
newspapers, the course of wholesale prices, but many interesting 
calculations are brought to a standstill because of the complete 
dearth of records of retail prices. With regard to wages, we can 
estimate fairly accurately standard and average wages, but, in 
default of an industrial census, do not know how many persons 
are in receipt of each given wage, nor the relative numbers of 
masters and men. We know fairly well the mass of trade that 
leaves or reaches our shores, but as regards the far greater mass 
of our internal trade our ignorance is almost complete. Till 
there is a public demand for such information, it will need a very
enlightened government to spare the time, trouble, and expense 
necessary for a systematic attempt to fill up these gaps ; but 
we can all do something towards this enlightenment, and in 
furtherance of this demand, by studying what has been done 
in other countries, and building up a knowledge of the science 
of statistical investigation. 

The absence of such a demand is perhaps due to a widely
spread and not unreasonable distrust of statistical estimates,
crystallized in the common remark that "anything
can be proved by statistics." This is to a great
extent the fault of the criticising public themselves : they are
always requiring and the newspapers always supplying informa- 
tion, which depends on a statistical basis, but for which good 
statistics are not to be found for one or other of the reasons 
already indicated. The informant must perforce 
turn to inaccurate estimates, and the public has no 
knowledge or discrimination as to what estimates rest on satis- 
factory data, or indeed as to what quantities are capable of 
statistical evaluation. Again, figures which cover only part of 
the subject, such as the Wage Census average, or the Labour 
Gazette returns of unemployed, may be quoted as universal ; mere 
estimates, made for quite other purposes, may be given as 
accurate and complete; and on such unreliable premises argu- 
ments are based, which naturally, by a judicious choice of 
material, can be made to support any theory at pleasure. It 
will generally be found that the statistician, on whose authority 
such statements are supposed to be based, is not to blame. 
Some of the common ways of producing a false statistical 
argument are to quote figures without their context, omitting
the cautions as to their incompleteness, or to apply them to a 
group of phenomena quite different to that to which they in 
reality relate ; to take estimates referring to only part of a 
group as complete ; to enumerate the events favourable to an 
argument, omitting the other side; and to argue hastily from 
effect to cause, this last error being the one most often fathered
on to statistics. For all these elementary mistakes in logic,
statistics is held responsible. 

Perhaps statisticians themselves have not always fully recog- 
nised the limitations of their work. At best they can measure 
only the numerical aspect of a phenomenon ; while
very often they must be content with measuring
[...]cts they wish, but some allied quantity. We wish to
[...] instance, the extent of poverty, its increase or diminution [...]
poverty we cannot define or measure, and we cannot even
[...] number of the poor ; all we can do is to state the
[...] officially recognised paupers, and add perhaps some
[...] from private sources ; but this gives us no clue to the
[...] of poverty in individual cases. Or we wish to obtain
statistics of health : all we can measure is the death-rate and
average length of life, very different matters. The statistician's 
contribution to a sociological problem is only one of objective
measurement, and this is frequently among the less important
of the data ; it is as necessary, however, to its solution as 
accurate measurements are for the construction of a building.

THE GENERAL METHOD OF STATISTICAL INVESTIGATION.

At first sight it will seem as if there were no method common 
to all statistical investigations, and indeed the processes differ 
so widely that it is not easy to outline a scheme which will 
include them all ; but the following sequence is generally 
indicated* as of general application, and will serve at least 
to thread an examination of methods together : (1) the Collection
of Material, (2) its Tabulation, (3) the Summary, and (4) a 
Critical Examination of its results. These processes will be 
discussed in detail in the following chapters. 

It may be well to state what equipment is necessary for the 
student who wishes to learn statistical methods. In collection
and tabulation common-sense is the chief requisite,
and experience the chief teacher; no more than
a knowledge of the simplest arithmetic is necessary
for the actual processes; but since, as we shall
see immediately, all the parts of an investigation are inter- 
dependent, it is expedient to understand the whole before 
attempting to carry out a part. For summarising, it is well to 
have acquaintance with the various algebraic averages, and 
with enough geometry for the interpretation of simple curves, 
though all the operations can be performed without the use of 
algebraic symbols. For criticism of estimates and interpre- 
tation of results, it is necessary to use the formulae of more 
advanced mathematics, and it is obviously expedient to under- 
stand the methods by which these formulae are obtained to 
ensure their intelligent use. They are specially necessary for 
the comparison of complex groups, and for estimating the 
significance of a divergence from the average, or the deviations 
in a list of periodic figures. 

* See, e.g., Dr Bertillon's Cours élémentaire de Statistique, to which the
present author is indebted for some of the treatment in the following pages.

(1.) Information is generally collected by issuing blank circulars,
forms of inquiry, to be filled in either by a few officials or by many
individuals, and the proper drawing up of this
form is one of the chief tasks in a good investigation.
Before this form is issued it is necessary to formulate
a complete scheme of the whole undertaking, and even to have
some idea of what the resulting figures will be, so as to be 
able to arrange the details of the organization on the right 
scale, and adjust the tools used to their purpose. As already 
pointed out, the object whose measurement is wanted is not in 
general exactly that which can be measured, and the measurable
quantity nearest to it must be found; e.g., when the average
annual earnings of the working class were in question, the 
quantity first measured was the average weekly wage. Then 
some technical knowledge of the particular subject is needed ; 
and, if not possessed, a preliminary inquiry on a small
scale may be necessary to show how to fit means to ends. 
The people who possess the information required must be 
discovered and interrogated at first hand. The questions put 
must be those which will yield answers in a form ready for 
tabulation, and the scheme of tabulation must
therefore be thought out beforehand. The questions
must be so clear that a misunderstanding is impossible,
and so framed that the answers will be perfectly definite, 
a simple number, or "yes" or "no." They must be such as
cannot give offence, or appear inquisitorial, or lead to partisan 
answers, or suppression of part of the facts. The mean must 
be found between asking more than will be readily answered 
and less than is wanted for the purpose in hand. The form 
must contain necessary instructions, making mistakes difficult, 
but must not be too complex. The exact degree of accuracy 
required, whether the answers are to be correct to shillings or 
pence, to months or days, must be decided. Every word and 
every square inch of space must be keenly criticised. A 
little trouble spent upon the form will save much inconvenience 
afterwards. 

(2.) In considering what method is to be adopted for tabula- 
tion, we must remember that the investigation is intended to 
furnish the answers to certain definite questions (how many
people, what wage, what price), and each
column must present some total which is relevant to these
questions. The exact scheme employed will differ in different
inquiries. In the population census, the tabulation is almost 
automatic ; in the wage census, the best and simplest way to 
show the grouping about the average wage in each occupation 
had to be specially devised ; in trade statistics the number of 
different categories to be adopted and the limits of each raise 
difficult questions. In general, the scheme of investigation re- 
quires knowledge of certain groups ; and the totals resulting 
from tabulation should show the numbers of items in these, so 
that after tabulation, instead of the chaotic mass of infinitely 
varying items, we have a definite general outline of the whole 
group in question. 

(3.) When the raw material is worked up to this point, skill of 
a different kind is wanted. From the numbers obtained, we 
have to pick out the significant figures; so to
present the totals and averages as to give a
true impression to an inquirer ; to summarise briefly the 
information obtained ; to concentrate the mass into a few 
significant averages, and to describe their exact meaning in
the fewest and clearest words, for it is the result of this 
concentration which will generally be used and quoted. To 
do this skilfully requires an acquaintance with the method of 
averages and the use of diagrams. It may further be necessary 
to fill in unavoidable gaps in the figures in order to supply esti- 
mates for intermediate years; this needs a study of the dangerous 
method of interpolation. Finally, the verbal description of the
process, its genesis and results, and an estimate of its accuracy 
must be added, and then the investigation is complete. 

(4.) The student who has to make use of statistics should not 
be content to take the results of an inquiry on authority, but
ought to acquaint himself with all these details of
method. Before the results can be criticised, it
is necessary to know the complete genesis of the figures;
whether the whole field was covered ; exactly whence the 
information tabulated was obtained ; whether there was a 
possibility of bias; how nearly the individual answers were 
correct; whether the informants really knew the facts they 
related, and if they were likely to state them correctly. The 
published statement of the results should show clearly the 
whole scheme of collection so as to make this criticism possible ; 
in particular, specimens of the original blank forms should be
included, so that the reader can judge whether the original
answers correspond exactly to the form of tabulation employed. 
Internal evidence often leads to much useful criticism. It can 
be seen whether the number of returns for each group is 
proportional to its importance, or if a specially important 
figure depends on only slight evidence. The continuity of the 
figures can be examined, and the causes of sudden gaps in- 
vestigated. The returns can be divided into sample groups, 
and the extent of the correspondence of these groups to the 
general result will often indicate whether the returns are 
sufficiently general. A careful study of the more minute 
tabulations may show within what percentage the final numbers 
may be expected to be correct. 
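
The check by sample groups described above can be sketched in a few lines. This is only an illustration under stated assumptions: the wage figures are invented, the function names are hypothetical, and the returns are split into interleaved groups, which is one of several reasonable ways of forming samples.

```python
from statistics import mean

def sample_group_means(returns, n_groups):
    """Split the returns into n_groups interleaved sample groups and average each."""
    groups = [returns[i::n_groups] for i in range(n_groups)]
    return [mean(g) for g in groups if g]

def max_relative_divergence(returns, n_groups):
    """Largest divergence of any sample-group mean from the general mean,
    expressed as a fraction of the general mean."""
    overall = mean(returns)
    return max(abs(m - overall) / overall
               for m in sample_group_means(returns, n_groups))

# Hypothetical weekly wages in shillings, invented for illustration only.
wages = [18, 22, 25, 20, 24, 19, 23, 21, 26, 22, 20, 24]

print(sample_group_means(wages, 3))
print(round(max_relative_divergence(wages, 3), 3))
```

If the sample groups agree closely with the general result, the returns are probably numerous enough for the purpose; a wide divergence suggests that the final figure rests on too slight evidence.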

The most important function of statistics is to produce 
evidence showing the relation of one group of phenomena to 
another ; for the information obtained is presumably intended 
as a guide for action, the guidance is generally needed to show 
what actions are likely to produce certain desired effects, and 
this is best investigated by finding how such effects have been 
produced in the past. We have then to determine whether 
changes in one measurable quantity (e.g., the duties on corn)
have produced changes in another (e.g., the amount of pauperism);
a problem generally insoluble, but one on which most light
can be obtained by the study of the relevant statistics in the
light of mathematics, the mathematics of probability, and it is
in this particular branch of mathematics that recent statistical
progress has been chiefly made.

Such questions, however important, are somewhat abstruse,
and require a certain amount of technical knowledge which
is not in the possession of the general student. The plan of
this book has been to postpone all questions requiring such technical
or mathematical knowledge to the Second Part, and to confine
the earlier discussions to problems needing no special training.

Section 1. The Population Census.

The population census will provide good illustrations of the
principles laid down in the last chapter, both because we shall
be at first on familiar ground, since every one knows
its scheme, purpose, and details, and because the
form of inquiry used for the collection of the original data
brings out very prominently the difficulties met with in detailed
statistical investigations.

The first thing to be considered is the exact object for
which the census is undertaken. It is for demographical purposes:
to supply information as to the numbers and
local distribution of the population, the numbers of
each sex and age, their so-called civil condition (i.e., whether
single, married, or widowed), and their nationality. This is the
minimum information necessary for administrative purposes.
In addition to these facts there are very many others which the 
statesman and the economist wish to know about each member 
of the population, and the census form is the only means in 
England of collecting universal data ; the question as to which 
of these shall be investigated and which neglected, is decided 
more by expediency than on principle. Of these
desiderata the following may be mentioned: the
size and structure of the family, its position in the social scale, 
the economic position of its head ; the nature of employment 
of its members, the wage or income of each member and of the 
family as a whole, the rent and size of their house, their educa- 
tional condition, the ages at which they commenced or retired 
from work, their migrations, their combination in religious or 
other bodies, and their infirmities. It is clear that some of 
this information must be dispensed with, if the form is not to 
be overcrowded, and if the tabulation is to be finished in any 
reasonable time; and an examination of the general nature
of the questions which can suitably be put will show how the 
necessary selection is made. 

First, the questions must be those which the informant is 
able to answer. Now, if the questions were only to be put 
educated and methodical persons, doubtless a
full account could be given of the family migrations
and of the ages at which each member had been at work;
but the peculiarity of the census is that it is universal, and 
the questions must be such that the least educated and most 
unthrifty householder shall be able to answer ; in many cases 
such facts would have been unrecorded and forgotten. 

Secondly, the questions must be perfectly definite, so that 
there can be no doubt as to what the right answer should be.
The only answers which are of value to the
statistician are "yes," "no," or a simple number. 
Adjectives and adverbs such as many, often, partly, &c., bear 
different numerical meanings to different people, and, though 
they may express fairly clearly the position of an individual, 
are nearly useless for tabulation,* which is their only purpose 
so far as the census is concerned. Thus the question as to 
education would have to be, not " state whether well, moderately, 
or badly educated," but " state at what age school was left," or 
"how many years at school?" But even if such questions 
were not excluded by our first test, by the forgetfulness of the 
informant, the statements given would be of little practical value, 
and very often incorrect. An inquiry as to wage and income 
could not be made sufficiently definite without so many questions 
as to require a form to itself ; for wages, as we shall see when 
considering the Wage Census, require very careful definition, and 
many subsidiary questions must be put to get a proper estimate ; 
the simple query, "what is your weekly wage or annual income?" 
would be answered on so many varying principles that the result 
would be valueless. 
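
The second rule lends itself to a mechanical test. The following is a minimal sketch, assuming that answers arrive as written text; the function name `is_definite` and the sample answers are hypothetical and serve only to show which replies could be tabulated directly.

```python
def is_definite(answer):
    """True if the answer is "yes", "no", or a simple non-negative number."""
    text = answer.strip().lower()
    if text in ("yes", "no"):
        return True
    # Accept digits with at most one decimal point, e.g. "12" or "7.5".
    return text.replace(".", "", 1).isdigit()

# Hypothetical answers to a question on education, invented for illustration.
answers = ["12", "yes", "moderately", "often", "No", "7.5"]
definite = [a for a in answers if is_definite(a)]
print(definite)
```

Replies such as "moderately" or "often" are rejected, not because they are untrue, but because they bear different numerical meanings to different people and so cannot be added into a total.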

Thirdly, the questions must be such as will be answered 
truthfully and without bias. There is hardly a demand on
the census form which would not be excluded, if
this rule was too rigorously enforced, as we shall
see immediately. The worst offender in this respect is the
question, Employer or employed? For though there are many
cases in which a man is both employer and employed so that
this question should be excluded by our second test, many
persons consciously exaggerate their social importance by
erroneously replying the former. Questions relating to social
position must generally be excluded by this rule.

* But see p. 138, infra.

Fourthly, the questions must be those which will be answered
willingly, and must therefore not be inquisitorial, or such as
to raise apprehension of a change of law or an
imposition of taxes. Questions as to membership
of trade unions, or of friendly societies, or as to insurance,
would be thought inquisitorial. Many would refuse to state
their incomes, holding it to be no one's concern but their own.
Questions as to rent might be regarded as possibly leading
to taxation. Questions as to religion are badly answered, as
was shown in the evidence before the Census Committee of
1890,* and should be excluded by each of these four rules.
Some persons do not know what their religion should be 
named, others would find the question indefinite, others would 
deliberately answer wrongly, and many not at all. 

The questions on the census form† not excluded on one
or other of these grounds are Nos. 1, 2, 3, 4, 5, and 10;
these are fairly definite, and householders are generally able 
and willing to give correct answers to them. Questions 6, 7, 
8, 9, and 11 compete with many others, which lead to equal 
inaccuracies, for a space on the census sheet. No. 6 has long 
held its place because of its great importance ; Nos. 7, 8, 
and 9 are on their trial. A further discussion of the merits 
of some of these is to be found in the Report of the Com- 
mittee already mentioned ; here it is only intended to indicate 
the general grounds of inclusion or exclusion. 

So far we have not discussed the important question as to
who should fill in the form. If, as in the English Census,
it is to be filled in by the householder, the questions
must be much simpler in matter and words
than if it is to be filled in by an official teller. In the latter
case the form may be much more complicated, the questions 
more inquisitorial and such as might lead to indefinite answers 
on the part of ignorant people; for the teller would insist on
an answer, be able to exclude those obviously wrong, and
cross-question till the indefinite answers were so altered as to
allow definite tabulation. In a great and complex undertaking
like the Census, where many tellers must be impressed for a
single day's work, their instructions and the general plan must
be sufficiently simple; but as the extent of an inquiry contracts,
the tellers can receive more complete instructions, and
the information requisitioned may be more complex. This is
of most importance in connection with columns 6-9.

* Report of Committee on the Census, 1890 (C. 6071).
† Facing p. 23.

The general shape and appearance of the sheet needs 
attention. If the structure of the family is to be shown, the 
answers are best given on a single sheet, which
must contain enough lines for the largest ordinary
household, so that the trouble of fastening together of many 
couples may be avoided, and tabulation not be hindered. The 
spaces must contain plenty of room for answers in uneducated 
handwriting, without making the whole so large as not to lie 
easily on a desk. The instructions must be distinct and visible, 
and placed in close connection with the answers; to further 
this, a skilful use may be made of capitals, italics, and different 
founts of type. On the form facing p. 23, those in use are 
roughly reproduced in miniature. 

The form should always show for what purpose the figures 
are collected, and how they will be used, in order to enlist the 
support of the informant and allay misapprehension.
The extent to which this should be done depends
a good deal on whether the filling-up is compulsory, as in 
the population census, or voluntary, as in the wage census. In 
the case before us no preamble is necessary, since every one 
knows the main features of a census, and most are willing to 
further its objects; but it must be shown that the inquiry is 
sanctioned by Parliament, and that compliance is compulsory. 
This is done on the back, on the fold which is outside before the 
form is opened; and even though penalties are threatened 
against absence of or falsification of returns, the last sen- 
tence describes the object of the inquiry and guarantees the 
informant against malicious use of his answers. Where information
is voluntary, a careful letter should be printed and
circulated with the form, persuading the informant to give his 

While the main part of the form is filled in by the householder,
other parts are filled in by the officials, and with very
little trouble a good deal of subsidiary information
can be collected in this way. On the outside the
Parish, Town, Sanitary District, Street, and Number are endorsed, 
so that the answers can be tabulated for any of these districts. 
The teller could also, as he took the form, enter the number of 
stories to a house, which is not done in the English Census, and 
other information as to the style of house and street might be 
endorsed. In a more intensive investigation, Mr Charles Booth's 
assistants, for instance, could be trusted to come out of a house 
with an accurate knowledge of many interesting details. 

We can now proceed to the individual criticism of the form 
in the light of the rules suggested above. In the first place, 
even the arrangement of columns is not perfect. To
labourers who are not in the habit of writing at all,
and who have (to judge from election posters) to be instructed 
how to put their mark in the right place on a ballot paper 
(many papers being destroyed simply through ignorance), this 
arrangement of horizontal and vertical columns would be con- 
fusing, and without help they would not gather at all what they 
were to do. They would fill up more easily a paper in which 
the answers were to follow the questions immediately : — 

    State your Name
    State your Age
    State your Sex
    Unmarried, Married, or Widowed
    and so on.



This form, however, could only be used if a separate paper were 
to be filled in for every individual, children and all. Other 
elementary matters might be improved. On looking through the 
form a great number of words and phrases will be found which 
are not in common use, e.g., abode, dwelling (as a noun), elsewhere,
East Indies, imbecile, "precise" infirmity, general term,
column, the foregoing, condition as to marriage. In column 1
the phrase, "name and surname" reads as though surname were
not a name, and perhaps the word "surname" is not in general
use, so that the printed word might be taken to mean title, and
the confusing answer " none " written under it. Does the in- 
struction " write after " mean to the right, or below ? 

The first question, which for the general purpose of the
census should be the most definite of all, leaves some room for
doubt. What of a night-watchman returning at
4 A.M., or a printer at 2 A.M.? What constitutes a
traveller: does a man who leaves the house before midnight, or
a man who goes down to Brighton by the theatre
train come under the term? Is midnight or 2 A.M.
the critical time? What of a person who dies at 1 A.M., or a
birth at midnight? How is the householder to know whether 
any of his establishment are returned elsewhere? Since too 
many instructions only lead to confusion, the tellers should be 
specially taught the answers to such questions. 

The very meaning of the phrase "population of a district"
is open to much doubt. In France "la population de fait,"
which consists of all present in the given district
at the given moment, is distinguished from "la
population de droit," which consists of all usually resident in the
district, including those temporarily absent, and excluding those
only momentarily present, and from "la population municipale,"
which is "la population de droit," less prisoners, hospital patients,
scholars resident in schools, members of convents, the army, and
so on.* The English Census counts "la population de fait."
In the United States we find a "constitutional population,"
which excludes residents in Indian Reservations, the Territories,
and the District of Columbia; the "general population,"
which includes in addition the Territories (except the Indian
Reservation, Indian Territory, and Alaska); and the "total population,"
which includes all excluded in the former.† In the
future questions will arise as to the inclusion of the Philippines 
and Cuba. Notice that the Channel Islands and the Isle of 
Man are included in the English Census. 

* See Bertillon, ibid., p. 146.

† Willcox: Area and Population of the United States at the XI.
Census, a book which gives a very useful criticism of the accuracy of the
most elementary data of statistics. It is a pity that space is wasted in a 
useless attempt to supplant the word " statistician," which has now a definite 
meaning, by the word " statist," which has another equally definite meaning. 
Does Dr Willcox wish to substitute "statics" for "statistics"? 




It is possible to find difficulties in filling up all the columns
except No. 4. For illustration, consider how column 2 should 
be filled in in the case of a cousin who was a " paying guest," or 
a relation who was a visitor ; for column 3, is a divorced person 
single or a widower, and what of a woman who is doubtful 
whether her husband is lost at sea? Errors come from No. 3 
because many unmarried people call themselves married. 

It is well known that column 5 is wrongly filled in for two 
reasons: one, that elderly people often do not know their ages
accurately and enter them to the nearest round
number, so that the returns congregate at 40, 50,
60: the error thus arising is eliminated by tabulation in the 
groups 35-45, 45-55 years, &c., and for more minute tabulation 
the groups 3-7, 8-12, 13-17, &c., are suggested : the other is that 
many ladies habitually enter their ages too low ; in this case 
also the Registrar-General is able to deduce nearly correct 
totals. 
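
Why grouping about the round numbers removes the first error may be seen from a small sketch. The ages below are invented, the function names are hypothetical, and the groups are assumed to run 35-44, 45-54, and so on, which is one reading of the limits given above.

```python
from collections import Counter

def round_to_ten(age):
    """Age as a forgetful informant might enter it: the nearest multiple of ten."""
    return (age + 5) // 10 * 10

def centred_group(age):
    """Group with the round number near its middle, e.g. 35-44 for ages about 40."""
    low = (age + 5) // 10 * 10 - 5
    return f"{low}-{low + 9}"

def decade_group(age):
    """Group beginning at the round number, e.g. 40-49."""
    low = age // 10 * 10
    return f"{low}-{low + 9}"

# Hypothetical true ages, and a hypothetical set of persons who round their age.
true_ages = [36, 38, 41, 44, 47, 49, 52, 53, 56, 58]
rounders = {38, 47, 56}
reported = [round_to_ten(a) if a in rounders else a for a in true_ages]

print(Counter(map(centred_group, true_ages)) == Counter(map(centred_group, reported)))
print(Counter(map(decade_group, true_ages)) == Counter(map(decade_group, reported)))
```

An age rounded to the nearest ten never leaves a group centred on that ten, so the centred totals are the same for true and for reported ages; groups that begin at the round number are disturbed, since a person of 38 returned as 40 passes from one group to the next.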

It is to be noticed that, since the ages stated are those 
"last birthday," the age will on the average be given six
months too low, and, in fact, the ages given as 17, e.g., should
be scattered nearly uniformly over the months to the eighteenth
year.
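
The six months' correction can be verified by a short calculation. The sketch assumes, as an idealisation, that the persons returned at a given age are spread exactly uniformly over the twelve months following their birthday, each month being represented by its middle point; the function name is hypothetical.

```python
from fractions import Fraction

def mean_exact_age(stated_age, months=12):
    """Mean exact age of persons returned at stated_age, one person being
    placed at the middle of each month after the last birthday."""
    midpoints = [stated_age + Fraction(2 * m + 1, 2 * months)
                 for m in range(months)]
    return sum(midpoints) / months

print(mean_exact_age(17))        # exact mean age, in years
print(mean_exact_age(17) - 17)   # amount by which the stated age falls short
```

Under this assumption the mean exact age of those returned as 17 is seventeen and a half years, so that the stated age is on the average half a year too low.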

The most important criticisms of the census-schedule are to 
be made on columns 6-9. It will not be expedient here to go
into all the questions raised before the Committee
on the Census as regards an industrial census.
While there can be little doubt that a thorough census of occu- 
pations would be best undertaken separately, and on somewhat 
different principles from the population census, it is certainly 
better, till opinion is ripe for so radical a change, to include 
in the present census the best questions we can as to occupa- 
tions, than to omit them altogether in despair of accurate 
results. 

The objects aimed at, which we must always keep in mind 
when criticising special questions, are two : to find the number 
employed in each trade and industry, that is, so to say, to 
form vertical divisions ; and to find the number in each rank 
or grade of employment (labourer, artisan, employer, &c.) in 
horizontal divisions ; so that the tabulation may give some such 
result:

                     Textile Industries.

                 Cotton.   Wool.   Linen.   Totals.
  Employers
  Managers
  Overlookers
  Spinners
  Weavers
  Labourers
  Children

  Totals

The necessary minimum of information would be given by 
such answers as 

Legal — Solicitor — Managing clerk. 
Mining — Coal — Hewer. 
Metal-worker — Iron — Smith's striker. 
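
The double tabulation by trade and by grade can be sketched as follows. The returns are invented for illustration, and the function name `cross_tabulate` is hypothetical; each return is assumed to have been reduced already to a pair giving the industry and the grade of employment.

```python
from collections import Counter

def cross_tabulate(returns):
    """Count returns in each (industry, grade) cell, with the totals of each
    vertical division (industry) and each horizontal division (grade)."""
    cells = Counter(returns)
    by_industry = Counter(industry for industry, _ in returns)
    by_grade = Counter(grade for _, grade in returns)
    return cells, by_industry, by_grade

# Hypothetical returns, invented for illustration only.
returns = [
    ("Cotton", "Spinners"), ("Cotton", "Weavers"), ("Wool", "Weavers"),
    ("Linen", "Labourers"), ("Cotton", "Spinners"), ("Wool", "Employers"),
]

cells, by_industry, by_grade = cross_tabulate(returns)
print(cells[("Cotton", "Spinners")], by_industry["Cotton"], by_grade["Weavers"])
```

The column totals and the row totals must each sum to the whole number of returns, which gives a ready check on the tabulation.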

Now the simple instruction, " State your occupation," would of 
course not lead to information of this sort. The coal-hewer 
would simply say miner ; the clerk, managing clerk ; the striker, 
very likely smith. To explain what is wanted and avoid mis- 
takes, the question is not put on the face of the form at all, but 
the informant is referred to the back, half of which is devoted to 
instructions relating to this column. These are lucid, carefully 
picked out with capitals and italics, comprehensive, brief and to 
the point. No one who wishes to fill in the form rightly, and is 
sufficiently educated to understand simple instructions, can easily 
go wrong. Yet, as a matter of fact, these instructions are in very 
many cases neither read nor followed ; and this fact is very im- 
portant in connection with the general study of blank forms of 
inquiry. Forms issued to people uninterested in the object in view 
will generally be filled in with the least possible expenditure of 
time and intelligence. Hence two courses are open : to reduce 
the question to the simplest possible form, and make the best of
the result; or not to allow the informants to write in their own
answers, but to take them viva voce by means of a teller, who
has mastered the instructions, and has the necessary legal force 
behind him to compel information. The latter course entails 
time and expense. 

The result of the present system of inquiry, combined with 
a faulty method of tabulation, which it to some extent makes 
necessary, is that we have no reliable census of occupations for 
the United Kingdom. The present figures break down both 
from faulty data and from insufficient tabulation directly we 
attempt to make any calculations depending on them. 

An attempt has been made to correct to some extent our 
ignorance of the relative numbers of unskilled and skilled 
labourers, employers and employed, by columns
7, 8, and 9. The headings are not a model of
clearness ; there is not the ordinary imperative " state " or 
"write," nor is one told on the front of the form whether to 
write Yes or No or to make a mark in the appropriate column, 
nor is the distinction between the three headings a perfectly 
definite one ; but still one is hardly prepared for the following 
statement in the report:*

" In numerous instances, no cross at all was made ; in many 
others, crosses were made in two or even all three columns, and, 
even when only one cross was made, there were often very 
strong reasons for believing that it has been made in the wrong 
column. Oftentimes this use of the wrong column can scarcely 
have been other than intentional ; being dictated by the foolish 
but very common desire of persons to magnify the importance 
of their occupational condition. This desire must have led 
many subordinates to return themselves as employers rather 
than as employed, for it is only on this supposition that we can 
account for the otherwise unintelligible fact that, under several 
headings, there are actually, according to the returns, more 
employers than employed, more masters than men. . . . We 
hold [these returns] to be excessively untrustworthy, and shall 
make no use whatsoever of them in our remarks." 

This attempt and its result are of the greatest importance to 
all who try to draw up forms of inquiry. 

* General Report on the Census of 1891, p. 36 (C. 7222 of 1893).

Before leaving the subject, it should be mentioned in passing
that we cannot deduce directly from our census the number of 
persons dependent on a particular trade for their living ; that is 
to say, the number of employers, their families (not otherwise 
returned) and domestic servants, and the number of employés
and their dependent families. This, the most important total 
for estimating the relative importance of different trades of the 
country, is not tabulated, though such tabulation has been found 
possible in other countries, and we are dependent on the esti- 
mates of statisticians for such totals.* 

To see how the information given by the answers on the 
census schedule can be worked up into detailed specific numbers, 
it is only necessary to look at the diagram and table prefixed 
to each of the sections relating to special trades in Mr Booth's 
Life and Labour of the People (e.g., vol. v., p. 46).†



* See Booth in Statistical Journal, vol. xlix.
† See p. 78, infra.

Section 2. The Wage Census.

The main differences between the wage census, taken in
1886, and the general population census are: (1) that the
filling up of the forms in the wage census was voluntary;
(2) that their correct filling up required a higher degree of
intelligence and education. As before, we must consider first
the object which the wage census was intended
to fulfil: it was to describe the earnings of the
people of the United Kingdom, to compare the rates of wages
trade by trade, and to find the relative numbers earning
at each rate. What is the best quantity to measure with this
object in view? As a preliminary question should we take the
day, week, or year as the unit of time? Clearly we
shall not be able to compute weekly wages if we
only obtain daily, for the week's work varies from four to seven 
days in different occupations. The week's wage is a more 
definite quantity ; but the simple comparison of weekly wages 
in different trades will be deceptive, because most trades are 
busier at one season of the year than at another, and in many 
the difference between season and season is very great ; in any 
particular week, then, we may be comparing the best season of 
one industry with the worst of another. To avoid this error, 
and because we do not know how many full weeks' wages are 
obtained in a year, except in a few non-intermittent trades, it 
would seem best to take the year as unit ; but the direct cal- 
culation of an individual's annual earnings is practically impos- 
sible. The employer is not acquainted with this sum, for in 
large establishments the hands are continually changing, and 
one man will be paid by two or more masters in the same 
year ; and even in a factory with a nearly constant personnel, 
the weekly amounts paid to individuals are not in general so 
tabulated as to be easily summed, and the working out of the 
totals would require a prohibitive amount of clerical labour. If 
we turn to the workman, on the other hand, we shall find in the 
majority of cases that no accurate account has been kept of 
earnings through the year, and it would only be by careful 
individual examination, impracticable on any large scale, that
an estimate could be made; in many cases the men, even if
willing, would be quite unable to give a connected account of 
their earnings during the past twelve months. 

It seems clear that we must adopt a smaller unit, and since 
most wages are paid weekly, a week is the most natural one. 
The subsidiary questions which will lead best to an estimate of 
annual earnings will be discussed below. The answer to the 
former question, as to the best quantity to investigate, is in- 
direct ; the only individual measurements we can obtain directly 
are the week's wages, but these may be supplemented by esti- 
mates en masse. 

Next, who possess the information we require? Clearly 
both employers and employed, and in an ideal census the 
answers would be obtained from both groups ; 
but considerations of simplicity, cheapness, and 
accuracy are all in favour of applying to employers 
alone. 

If employés were to be interrogated, the procedure would be 
as follows. Draw up a form on the analogy of the census form, 
describe very briefly the purpose of inquiry, add a short series 
of concise, lucid, simple questions in suitable type and with 
careful spacing, such as will lead to the minimum information 
required ; let these forms be left to be called for, and when 
collected, let the tellers have time and opportunity to examine 
and correct them. It is clear that this method would entail an 
even more expensive organization than the population census, 
and as the result of experiment it may be doubted whether the 
maximum of accurate information that could be thus obtained 
would come up to the minimum that would be of use. A partial 
inquiry could, however, be carried out by means of trade 
unions if they were willing to give serious assistance. 

The method of inquiry among employers was as follows : — 
Suitable blank forms and an explanatory letter were sent by post 
to all employers, whose addresses could be found in the industries 
selected for investigation, and the answers were returned to the 
central office by post. This is far simpler and cheaper than the 
suggested scheme for inquiry among workmen, requiring far fewer 
forms and only a small staff of clerks. With business men it is a 
simpler matter to post the return when completed than to keep it 
for collection by hand. Since there is no personal intercourse over 
the matter it is especially necessary that the questions should be 
lucid, for the additional correspondence necessary to rectify 
errors is a source of worry at both ends. A copy of one of these 
forms, abridged only in the number of subdivisions, is subjoined 
below. 



WAGE CENSUS. 

Return of the Rates of Wages Paid in Silk Manufactures. 

Name of Factory or Firm 

Address 



Note.—It is requested that the salaries of clerks and managers may be excluded. 
The return is of wages of working men only. 

Numbers employed on ________ 1886 - - No. ____ 

Amount paid in Wages in the year 1885 - - £ ____ 

Highest weekly amount paid in 1885 £ ____ Date ____ 

Number of Hands paid in that week - - No. ____ 

Lowest weekly amount paid in 1885 £ ____ Date ____ 

Number of Hands paid in that week - - No. ____ 

State the present average rate of pay for overtime : that is, whether 
overtime is reckoned as time and a quarter or time and 
a half, &c., or in what way reckoned ____ 

State whether overtime is at present being worked, and how much ; 
or whether less than full time, and how much less ____ 

Current Rates of Wages and Hours of Labour per Week of Persons 
employed in each Branch of the Silk Manufactures, on ________ 1886. 

Column headings : — 

Description of Occupation. (N.B.—It is requested that this list of 
occupations may be revised where necessary.) 

Current Rates of Wages Paid and Number of Hours of Labour per Week 
when in full work, but exclusive of Overtime. (Note.—State the Number 
of Hours of Labour per Week, whether the Workers were paid by Time or 
Piece-work, and if paid by Piece-work give the amount earned in a week, 
exclusive of Overtime.) 

    MALES.—Men ; Lads & Boys. 
    FEMALES.—Women (18 years and upwards) ; Girls (under 18 years). 

Occupations listed, each with one line for Time and one for Piece : — 

    Silk Throwing—Parters ; Winders ; Cleaners ; Spinners ; Doublers ; &c. 
    Silk Spinning—Openers and Sorters ; Boilers ; Dressers ; Preparers and Carders ; &c. 
    Silk Weaving—Winders ; Warpers ; Warp Pickers or Clearers ; Doublers ; Fillers ; &c. 

The measurement of the annual earnings of groups of 
workpeople was the ultimate object of the inquiry. Annual 
earnings are composed of many different items, 
of which the following are the most important : — 
Ordinary weekly wages, pay for overtime, special payment 
for special work (e.g., of builders if sent to a distance), or at 
special seasons (such as the harvest) ; and payments not in 
cash, such as free or reduced house-rent, free or cheap coal, 
and special goods at cheap or wholesale prices (such as cloth in 
textile factories, or potatoes for agricultural labourers). 

When payment in kind is at all general or important, it is 
generally better to proceed on a different method entirely, e.g., 
that followed by the Agricultural Sub-Commissioners of the 
Labour Commission. When it consists of only one simple item, 
such as a house rent-free, it can form the subject of an additional 
question on a form similar to that on p. 35. In the silk industry 
this does not occur ; but this discussion shows the necessity of 
preliminary knowledge on the part of the investigator before 
the right form of inquiry can be drawn up. 

We have left for consideration the weekly wage, and over- 
time and special payments, the last two of which can be grouped 
together. The ordinary weekly wage is a sufficiently general 
and definable quantity in most subdivisions of most industries. 
A foreman could generally state how much is earned in an 
ordinary full week for each of the hands under him. In many 
cases there is an hourly or weekly sum regulated by a trade union, 
as in the building trades. In others, as in the cotton industry, 
piece-rates are so regulated as to bring out a definite sum 
for the week's work graduated in relation to the difficulty of the 
task ; in general, a very rapid survey of the wage-book will show 
what the worker in each subdivision will make on an average. 
Thus the average weekly wage in an ordinary full week can be 
found with considerable accuracy, but this takes us only part of 
the way in the calculation of annual earnings ; we need to know 
in addition to this how many full weeks are made in the year. 
It is the method by which this is attempted on the printed form 
that is open to most criticism. The questions used are on 
page 35, and afford a good example of the general difference 
between the quæsita and the data which are attainable. The 
quæsitum is : To how many full weeks' wage are the annual 
earnings equivalent, allowing for slack weeks and overtime? 

The first crucial question to decide is : Are we to allow for an 
average loss of time, say a week in the year, through sickness, or 
are we to allow only for time lost through failure of 
work? Since sickness is an individual not a general 
misfortune, it will be better to exclude it if possible. 
Now overtime in one season, especially if its wages are on "time- 
and-a-quarter " or "time-and-a-half" basis, very quickly tends 
to balance slack time at another season, though it may be sup- 
posed that it is rarely the case that more than the normal week's 
wage is averaged through the year. Thus it will be logical as 
well as simple to estimate the year's earnings as so many normal 
weeks' wages. For example, if we found that two weeks were 
lost through sickness and three through the mill stopping, and 
that overtime in one busy month had added wages equivalent to 
two normal weeks, we should have forty-nine weeks' full wage. 
The figures which will give this result will be the total sum paid 
in wages in the factory in the year divided by the aggregate normal 
week's wage of the people dependent on the factory, supposed 
all at work. Thus, if 1,200 hands (men, women, and children) 
would, if all at work, make £1,000 in a normal week, and this 
was the average number dependent on the particular mill, and if 
£48,000 was paid in the year in wages, annual earnings would 
be equivalent to forty-eight normal weeks, and earnings would 
average £40. Now the total paid in wages is generally kept 
separate in business accounts, but the number dependent on the 
mill for work is often not known accurately ; for the personnel 
of a large establishment is subject to continual change, and the 
manager would not know whether a person who left went to 
another mill or got no work. The total number of all who had 
worked there during the year would be too great for this purpose, 
and the number at work in a normal week too small. The 
number open, perhaps, to least objection is the number at work 
in the busiest week of the year ; for those absent except through 
sickness when trade is busy cannot be said to be dependent on 
the factory, but if not at work elsewhere are among the per- 
manent unemployed ; very few workpeople indeed will be 
taking their holiday at a busy time, and it may reasonably be 
supposed that all the factories in the same industry will have 
their busy and slack seasons at nearly the same time. The 
answers then to the printed questions — Total paid in year, and 
number of hands in busiest week — tell us all we need to know, 
if we may make this assumption ; for then the total sum paid as 
wages in the year, divided by the maximum number employed 
in the busiest week, gives the average annual earnings. To find 
the equivalent number of normal weeks, multiply the maximum 
number employed by the average wage found on the second page 
of the form, so that the product shows the aggregate weekly 
wage if all were employed, and divide the total paid in the year 
by this product. 
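
The two divisions just described can be sketched in a few lines of Python. This is only an illustration under the assumption made in the text, namely that the number of hands in the busiest week stands for the number dependent on the factory; the function and parameter names are our own and are not taken from the Wage Census.

```python
def average_annual_earnings(total_wages_paid, hands_in_busiest_week):
    """Total paid in wages in the year divided by the number dependent on the factory."""
    return total_wages_paid / hands_in_busiest_week


def equivalent_normal_weeks(total_wages_paid, hands_in_busiest_week,
                            average_normal_weekly_wage):
    """Number of normal weeks' wages to which the year's wage bill is equivalent."""
    # Aggregate weekly wage if all dependent hands were at work in a normal week.
    aggregate_weekly_wage = hands_in_busiest_week * average_normal_weekly_wage
    return total_wages_paid / aggregate_weekly_wage


# Figures of the worked example in the text: 1,200 hands who would together
# make 1,000 pounds in a normal week, and 48,000 pounds paid in the year.
hands = 1200
normal_weekly_wage = 1000 / hands   # pounds per head in a normal week
total_paid = 48000                  # pounds paid in wages in the year

earnings = average_annual_earnings(total_paid, hands)
weeks = equivalent_normal_weeks(total_paid, hands, normal_weekly_wage)
print(earnings, round(weeks, 6))
```

The difference between 52 and the second result is then the estimate of lost time, exclusive of sickness.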

In the Cotton industry the sum of the greatest numbers 
employed (if these may be taken as equivalent to the 
numbers employed in those weeks when the wage bill 
was highest) was, in 1885, 87,887. £3,148,566 
was paid in wages in that year in the factories 
making returns. Average annual earnings were therefore 
$\pounds\,\frac{3{,}148{,}566}{87{,}887} = \pounds 35\ 16s.$ The average wage in a normal week in 
1886 was 15s. 2½d. ; the product of this and 87,887 is £66,830. 
The equivalent number of normal weeks' work is $\frac{3{,}148{,}566}{66{,}830} = 47$. 
Hence we may conclude that, if our basis of calculation is correct, 
five weeks was the average lost time at that date. 
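
The cotton figures can be checked in the same way. The sketch below is a hypothetical helper, not part of the Report; it assumes the pre-decimal currency (12 pence to the shilling, 20 shillings to the pound), reads the average normal weekly wage as 15s. 2½d., and takes the year as 52 weeks.

```python
from fractions import Fraction

PENCE_PER_SHILLING = 12
SHILLINGS_PER_POUND = 20
PENCE_PER_POUND = PENCE_PER_SHILLING * SHILLINGS_PER_POUND  # 240


def to_pounds(pounds=0, shillings=0, pence=0):
    """Convert a pounds/shillings/pence amount to an exact number of pounds."""
    total_pence = (Fraction(pounds) * PENCE_PER_POUND
                   + Fraction(shillings) * PENCE_PER_SHILLING
                   + Fraction(pence))
    return total_pence / PENCE_PER_POUND


hands = 87_887          # sum of the greatest numbers employed, 1885
total_paid = 3_148_566  # pounds paid in wages in 1885
weekly_wage = to_pounds(shillings=15, pence=Fraction(5, 2))  # 15s. 2 1/2d.

annual_earnings = Fraction(total_paid, hands)  # pounds per head per year
aggregate_week = hands * weekly_wage           # pounds per normal week, all at work
normal_weeks = total_paid / aggregate_week     # equivalent normal weeks
lost_weeks = 52 - normal_weeks                 # lost time, assuming a 52-week year

print(float(annual_earnings), float(aggregate_week),
      float(normal_weeks), float(lost_weeks))
```

Exact fractions are used so that the halfpenny is not lost to rounding before the final division.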

This is not the method adopted in the General Report of the 
Wage Census ; there the total paid in 1885 is divided by the 
number employed in a given week in 1886. This number is 
certainly too small, less than the number dependent on the 
trade, and as might be expected gives on analysis absurd 
results in some cases. It is to be noticed that the method here 
described cannot be employed in those few industries which the 
employés are able to leave in the slack season in order to earn 
wages in other trades which may then be exceptionally busy. 

Since there is no reason why the number absent through 
sickness should differ in the busiest week from the average 
number so absent, it is clear that the estimate we obtain for 
average lost time (five weeks in the cotton industry) is in addition 
to the average time lost through sickness ; this may often be 
estimated from the returns of friendly societies. 

In the corresponding French wage census, of which the 
results were published in 1898,* an estimate of the number of 
days' work obtained in the year is formed on a different basis. 

* Salaires et Durée du Travail, 1897, pp. 15, 16. 

The data collected were — (1) The variation each month of the 
personnel in each industry, which is found to average 4 per cent. 
for the year — that is, for each 100 employed, 96 
are found who have been in the same establishment 
for as much as twelve months : (2) The differences between 
the maximum and minimum numbers employed in each estab- 
lishment month by month during the course of a year, which 
are found to average 19 per cent, of the (? average) personnel. 
From this we may perhaps draw the conclusion that, on an 
average, half this number, at least, are in general out of work : 
(3) The number of different persons who have been employed in 
each establishment at one time or other in the year ; this is 
found to be 140 for each 100 permanently employed, from which 
the legitimate conclusion is that the average number of unemployed 
is not so much as 40 in 140, i.e., 28 per cent. These two 
percentages, 9 per cent. and 28 per cent., are taken to be the 
inferior and superior limits of average lack of work. This in- 
formation is more detailed and perhaps more reliable than that 
on which the method, used above for the English figures, is 
based. Data obtained from syndicates of French workmen 
indicate about 20 per cent, as the average want of work ; the 
English figures obtained by the method described above from 
the whole wage census yield about 12 per cent. 
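
The two French limits follow from simple arithmetic, which may be sketched as below. The function name and its parameters are our own labels for the quantities in items (2) and (3); the text quotes the results as whole numbers, 9 and 28 per cent.

```python
def unemployment_limits(range_pct_of_personnel, persons_per_100_permanent):
    """Return (lower, upper) limits of average lack of work, in per cent."""
    # Lower limit: half the average difference between the maximum and
    # minimum numbers employed, expressed as a percentage of the personnel.
    lower = range_pct_of_personnel / 2
    # Upper limit: the excess of different persons employed over the
    # permanent staff, as a share of all persons employed in the year.
    excess = persons_per_100_permanent - 100
    upper = 100 * excess / persons_per_100_permanent
    return lower, upper


# Figures from the French census: a range of 19 per cent., and 140 different
# persons employed for each 100 permanently employed.
lower, upper = unemployment_limits(19, 140)
print(lower, upper)
```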

This somewhat lengthy discussion on the few questions 
included on the first page of the form is a good illustration of 
the necessity of considerable preliminary study before a blank 
form can properly be drawn up. Space does not allow a 
detailed criticism of the rest of the form. 

Section 3. — The Work of the Labour Department. 

The Labour Department of the Board of Trade was founded in 
1893 ; its functions are to collect and publish information, chiefly 
statistical, relating to the economic conditions of 
workpeople, and the state of the market for labour. 
Its work lay almost entirely in virgin ground ; new sources of 
information had to be tapped, new methods developed. While 
it was untrammelled by tradition, it could avail itself of the ex- 
perience of the Board of Trade, and was already in touch with 
a widely extended organization. Under these circumstances 
it was soon able to attract a comprehensive and continuous 
supply of valuable information ; and the methods by which it 
accumulates and compiles its statistics should be interesting and 
instructive to all those whose business it is to work in any 
statistical field. 

The figures which are received periodically by the Department 
are published monthly in the Labour Gazette. Here the 
first article each month is on the " State of Employment." 
As before we must first consider the 
question, What are the exact objects of the investigation 
of which the results are here published? They are to find 
out how many persons are out of work in each trade and 
district, what percentage they form of all dependent on each 
industry, and how this percentage changes month by month 
and year by year. The next question is, How much of this 
can be discovered, and, if we cannot measure these numbers 
directly, what are the best allied quantities to measure? 
Since no universal register is kept of the unemployed, it would 
seem easier to estimate the number employed, since 
an employer can generally state how many workpeople 
he has at work at any given time. If we cannot discover 
the number of men at work, we may perhaps be able to find the 
number of machines, furnaces, mines, &c., at work, and deduce 
the number of men employed with them, and thence the number 
of unemployed ; or we may find for how many hours work was 
carried on in a factory or a mine ; or we may even go a step 
further back and find the amount of goods produced, and thence 
estimate the other quantities ; or we may learn the total amount 
paid in wages. These are the numerical methods ; but there are 
others, useful if not so exact. We can obtain reports as to the 
condition of employment in the various districts or industries, 
not in numbers, as is generally necessary, but with descriptive 
adjectives, — such as busy, slack, improving, much the same, — 
which may lead to numerical estimates, or may serve to check 
results. Lastly, organizations for facilitating employment may 
send in returns of the applications made to them. Nearly all 
these methods are in use at the Labour Department. 

Next, who possesses the necessary information ? As regards 
the number unemployed, the only registers kept are those of trade 
unions, to whose secretaries inquiries should be 
addressed. The figures so obtained will naturally 
only relate to those sections of an industry where trade unionism 
exists. As regards the number employed, the masters are the 
authorities, and forms must be sent to them asking the numbers 
at work day by day, or at longer intervals. With respect to the 
number of machines at work, the number of shifts, and the 
total wages paid, the masters again have the information. For 
the amount produced, the masters, or in some cases officials to 
whom they make returns, can supply the facts. For general 
information as to the state of employment, some presumably 
competent person, in touch with all the factories in an industry, 
or all the trades of a district, must be impressed to forward 
periodical reports. The Labour Department is in touch with a 
great number of such correspondents, many of them connected 
with the trades councils of their towns. 

The question as to whether the information will be given 
impartially and willingly need not detain us long in this case ; 
for, generally speaking, the returns are simply automatic copies 
or registers of known numbers, and would only be partial if 
wilfully falsified ; and since the returns are made periodically, 
the persons concerned regard them as a matter of course, and, 
once they have commenced, continue willingly to forward the 
requisite figures. 

By the courtesy of the Labour Department I have been able 
to obtain copies of most of the forms in use. There are some 
forty in all, each suited to some special industry or method of 
investigation. 

It must be remembered that the Labour Department had to 
form its own intelligence organization, and initially was obliged 
to apply to persons able to give information, just 
as any private investigator would. A connection 
had therefore to be established with trade unions 
and other societies, and with manufacturers ; and, when a nucleus 
had been formed, continual efforts were necessary to extend the 
organization in all directions. One or two of the circular letters 
written for this purpose are given on the following pages, since 
they are typical of the method which investigators must employ 
to enlist the help of possible informants who are uninterested. 
The points to notice are : — (1) The statement of the exact 
purpose for which the information is wanted ; (2) the simple 
and explicit direction as to what is to be done by the 
informant ; (3) the undertaking that the information will not 
be used in any way that can do, or appear to do, him injury. 

Here is one of the earlier letters, opening a connection : — 

Labour Department, 1894. 

Dear Sir, — The Labour Department of the Board of Trade, which 
is charged with the duty of collecting periodical statistics as to the 
condition of the Labour Market, is desirous of obtaining fuller informa- 
tion from month to month with regard to the state of employment in the 
Pig Iron Industry. For this purpose, the Department would be glad 
to receive monthly information from a large number of the employers 
in the United Kingdom as to the number of furnaces in blast and the 
numbers of workpeople employed, on the average, at each furnace. 

I shall accordingly be glad if you will be kind enough to assist the 
Department in making this inquiry complete by filling up and returning 
to me before the 4th of May the enclosed form. Postage need not be 
prepaid if the reply is addressed to " The Commissioner for Labour " 
at the address given above. 

The results of the inquiry will not be published in such a form as to 
render possible the identification of particular returns. — Yours, &c. 

When the Department had organized its work, and tabulated 
and published some of its returns, the next step was to endeavour 
to achieve completeness. When many are known to have given 
information, the more cautious will be encouraged, the less ener- 
getic be ashamed to be less public-spirited than their neighbours, 
and the critical anxious to correct mistakes. The first of the 
following letters, which is used for general purposes, takes advantage 
of these tendencies, and the second is another excellent 
example of the method of extending the organization : — 



1895. 

Dear Sir, — I am forwarding herewith a copy of the " Labour Gazette" 
for the current month, and beg leave to draw your attention to the article 

therein dealing with the state of employment in the 

The Labour Department is very desirous of making the information 
contained in these monthly reports as complete as possible, and trusts 
you will kindly assist by filling up and returning the enclosed form. 
You will notice that the form is of a very simple kind, and one that can 
readily be filled up without much trouble. 

I may add that Returns are regarded as strictly confidential and are 
only used to produce general statistical results in which the identity of 
individual returns is lost. — I am, &c. 

March 1895. 

Dear Sir, — This Department has for some time past received 
monthly Returns, both from the Dock Companies, and the Ship-owners 
who do their own unloading work in the port of London, with regard 
to the number of Dock Labourers employed. These Returns are 
collected with a view to throwing light on the periodical fluctuations in 
the employment of this class of labour ; but the figures are published in 
a general total and not in such a way as to make possible the identifi- 
cation of particular firms supplying the information. The article on 
page 36 of the enclosed Labour Gazette will show you the use made 
of the Returns. 

Hitherto no exact information has been obtained with regard to 
employment of labour at the wharves, and you will readily see that the 
addition of such information would very greatly increase the value of 
the statistics. The managers of several of the most important wharves 
on both sides of the river have been good enough to promise to make 
monthly Returns ; and I should be greatly obliged if you could see 
your way to assist the Department by supplying the information speci- 
fied on the enclosed form, not later than the date there indicated. 

You will observe that a form is provided for the daily number of 
labourers employed, and, alternatively, for the average weekly number. 
The daily number would, on the whole, be the most useful for the pur- 
poses of the Department; but if for any reason you cannot see your 
way to supply such detailed information, a weekly average would be of 
value. — I am, &c. 

Another letter may be given as serving to encourage those 
who have already engaged in the good work. It will be noticed 
that, though now more concise, it is still insinuating. 



Agricultural Labour in January, 

Dear Sir, — I am instructed by the Commissioner for Labour to ask 
you to be good enough to favour this Department with replies to the 
questions on pages 2-4 of this form, by Friday, 4th February 1898. 

I beg at the same time to thank you for your kindness in send- 
ing answers to questions put to you by the Department on former 
occasions. — Yours, &c. 

These letters are well worth noticing because they have 
assisted to build up a very efficient organization for information 
out of nothing, and have succeeded in eliciting answers from 
uninterested men of business, who are not given to spending 
time and trouble on unremunerative labour. 

Please forward this Return to the address on the back not later than the 
[illegible] of the month succeeding that to which it relates. No postage 
need be paid. 

Return of State of Employment 

in Month of 189 

Name of Society 



Total number of members in Society at close of month 

Number receiving out-of-work pay in last week of month 

(Do not include members on strike or locked out.) 

State, if possible, number of members entirely unemployed but not 
receiving benefit in last week of month 

State of employment for month 

If any dispute, change in wages or hours of labour has occurred, please 
say, and the necessary forms will be forwarded at once 

Remarks 



Signed. 



Secretary, 

Date 189 

The form given above is that issued to trade-union secretaries, 
and it is by its means that the only perfectly definite 
measure of want of employment is obtained. It 
should be remembered that it is filled in monthly 
by the same official, and requires no special explanation. Since 
the first attempt to draw up such a form is apt to present 
difficulties, this may be noticed in detail. First, we find an instruction 
as to the way it should be returned. Most forms are 
provided with a printed envelope to save trouble and mistakes, 
and postage is paid by the investigator. Next comes a brief 
heading and the date, and then the name of the 
society, an item used for reference and further 
inquiry, but not for publication. Now we need 
to know chiefly the percentage unemployed, but secondarily 
the total number. The questions most easily answered should 
be asked, and the calculations done at the central office, for the 
trade-union secretary may make arithmetical mistakes. Again, it 
is not the numbers day by day that are asked, for they are hardly 
known ; nor week by week, which would give trouble ; nor even 
the average for the month, for that might lead to guess-work ; 
but a definite day or week is decided on, the same for all trades. 
For purposes of comparison trade with trade, or month with 
month, this is found sufficient. 

We notice next an important point connected with the definition 
of Unemployment, and also an illustration of the necessity of 
studying the figures at their source. It is not 
stated explicitly in each Gazette whether men on 
strike or locked out are included as " unemployed " or not. A 
reference to this form shows that they are not included, and, 
therefore, before conclusions are drawn as to the amount of 
work obtained year by year, the excellent statistics relating to 
labour disputes given in another part of the Gazette must be 
studied. 

All members of trade unions out of work do not at once 
receive " benefit," the technical term for any payment of union 
funds, and therefore a correction must be made in some cases 
for those who have not yet come " on the society." The 
number is likely to be known accurately to the local secretary ; 
but if the form had to be filled in at a London office, say, for 
the whole Amalgamated Society of Engineers, the number 
would not there be known. At this point we are left in some 
doubt as to the methods of the Department ; for we are not told 
whether these forms are sent to all local secretaries, or whether 
they are filled up centrally for whole districts, or what additional 
information is obtained from other sources. 
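
The calculation left to the central office may be sketched as follows. This is a minimal illustration and not the Department's procedure, which, as just noted, is not fully described: the function name is hypothetical, the returns are invented for the example, and pooling the totals of all societies is our own assumption.

```python
def percent_unemployed(members, on_out_of_work_pay, unemployed_without_benefit=0):
    """Percentage of a society's members unemployed in the chosen week.

    Members on strike or locked out are assumed to be excluded already,
    as the form instructs.
    """
    if members <= 0:
        raise ValueError("a return must report at least one member")
    unemployed = on_out_of_work_pay + unemployed_without_benefit
    return 100 * unemployed / members


# Invented returns, for illustration only:
# (members, receiving out-of-work pay, unemployed but not receiving benefit)
returns = [
    (1200, 36, 4),
    (800, 20, 0),
]

per_society = [percent_unemployed(m, paid, unpaid) for m, paid, unpaid in returns]

# Assumption: the combined percentage is taken on the pooled totals,
# so that large societies count in proportion to their membership.
total_members = sum(m for m, _, _ in returns)
total_unemployed = sum(paid + unpaid for _, paid, unpaid in returns)
overall = 100 * total_unemployed / total_members

print(per_society, overall)
```

Asking only for the raw numbers, as the form does, keeps the division in one place and allows the correction for members not yet receiving benefit to be applied wherever it is reported.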

The next line is for a qualifying adjective which will serve 
to check labour correspondents' information, and to indicate 
whether the last week was typical of the month. 

When an organization is ready for a special purpose, any 
secondary use may be made of it that will not vitiate its chief 
end. The Labour Department is always anxious 
to hear of all changes of wages, and in general has 
to detect their existence for itself; hence no opportunity of 
obtaining such information is lost, and this widely circulating 
form is used for the purpose. 

Lastly, if the informant is an intelligent man and acquainted 
with the methods of the Department, it is well to give him an 
opportunity of adding any relevant remarks that may occur to 
him. On the line " Remarks " might be given some reason for 
any exceptional numbers or information as to trade prospects 
which might furnish a clue to the Department for other investi- 
gations. The paper is signed and dated to show that it has 
been filled in officially and at the right time. 

These forms are not always filled in and returned to date ; 
sometimes special application has to be made for them, and 
occasionally the necessary numbers have to be interpolated from 
other sources. 

The next form given is more complicated, and illustrates 
two of the methods mentioned, finding the number of persons 
employed and the number of days worked. 

Please fold and return this form by the 29th December to the address 
given on back. 

EMPLOYMENT AT COAL MINES. 

Individual Returns are regarded as confidential and not published 
separately. 

County in which Pits are situated 

Name of Firm or Company 

Postal Address to which form should be sent 



Number of shifts usually worked in each 24 hours 

[Table, one row for each Pit or Seam. Columns : Name of Pit or Seam ; number of Workpeople ; number of Days on which Coal was hewn and wound (Col. 5) ; Short Time, if any, on the days stated in Col. 5, in Days and Hours (Col. 6). A further entry asks for the No. of " Other Workpeople " not included above, and the Number of Days worked by them.] 

* The number of Workpeople should include all Men and Boys, &c., 
employed in and about the Pits, except Clerks and Managers. The number 
should also include " Drawers " and others who may be paid by " Hewers." 

† The number of Days, whether full or not, on which Coal was " hewn and 
wound " should be inserted in this column. If on any of these days short 
shifts only were worked, the extent of the time lost should be stated in 
Column 6 ; but it should be left to the Labour Department to deduct from 
the figures in Column 5 the Short Time, if any, given in Column 6. If the 
time worked on Saturday is usually shorter than on other days, no reduction 
should be made on that account. 

‡ Short Time.—Please state here any special reasons for Short Time : — 

The form has one or two peculiarities. A colliery company 
has often several pits, so that in the first place it is not obvious 
at which address this information can be most 
readily given, while it is important not to waste 
time and trouble in forwarding from office to office; and, in the 
second place, not only will work be done for different lengths of 
time in different pits, but also there will be variation from seam 
to seam in the same pit. This was not recognised in the first 
form sent out, but a second had to be sent distinguishing the 
pits and asking for subsidiary information. In this form may 
be seen the modifications that must be introduced to suit the
questions to particular industries. In a colliery the factor which 
determines the state of employment is generally not the number 
employed, but the number of days' work, the number of days
"coal is wound." A colliery at full work may make four, five,
or six days a week, or eleven days a fortnight (leaving one day 
a fortnight for repairs), according to the custom of the district 
and the state of the trade, and there may be two shifts or three 
in the twenty-four hours. If work is slack, the number of shifts 
per fortnight, which is really the essential quantity to know, will 
be diminished, and the alteration will very likely affect all
employés equally. Again, since the colliers are not all at work at
once, the question is not "how many are at work?" but "how
many are paid?" the pay-day, once a fortnight, or however
often it may be, being perhaps the only time when all the 
workers are together. The number at pay-day is, therefore, the 
number employed in the mine, a quantity varying as new seams 
are opened or old seams worked out, and the number of days 
on which there is work is the factor which determines the 
amount of work obtained per workman. Notice that the ques- 
tions, number of shifts per day, number at pay-day, and days at 
work, are precisely those which the manager will find easiest to 
answer. Since, however, days are of different lengths, depending 
on the demand for coal, the good working of machinery, the 
presence of the necessary trucks, and the efficacy of the railway 
arrangements for clearing the yards, and other circumstances, it 
is necessary to know whether on any working days, winding 
stopped early or the shift was shortened ; hence the question in 
column 6, which will give the manager more trouble. 
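
The deduction of Column 6 from Column 5 described above can be sketched as follows. This is only an illustration: the names `PitReturn` and `net_days_worked`, the conversion of lost hours into days, and the eight-hour full day are assumptions made for the example, not the Department's stated procedure.

```python
from dataclasses import dataclass


@dataclass
class PitReturn:
    """One line of a colliery return (field names are hypothetical)."""
    workpeople: int          # number on the pay-sheet
    days_wound: float        # Column 5: days on which coal was hewn and wound
    short_time_hours: float  # Column 6: hours lost through short shifts


def net_days_worked(ret: PitReturn, hours_per_full_day: float = 8.0) -> float:
    """Days in Column 5 less the short time in Column 6, expressed in days.

    The length of a full day is an assumption; it varies by district.
    """
    return ret.days_wound - ret.short_time_hours / hours_per_full_day


# Hypothetical pit: eleven days in the fortnight, twelve hours of short time.
example = PitReturn(workpeople=400, days_wound=11, short_time_hours=12)
net = net_days_worked(example)
man_days = net * example.workpeople
print(net, man_days)
```

Keeping the two columns separate on the form, and doing the subtraction centrally, means that every return is reduced by the same rule.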

[Side-note: Forms for other industries.]
In the form relating to dock labour, the question is simply,
how many are employed, not at the end of the month, but
day by day; for labour at the dock fluctuates violently and
continually, as may be seen from the monthly diagram in the
Labour Gazette. On that relating to the Surrey
Commercial Dock, the question is again modified
to suit, it may be supposed, a special method of bookkeeping, and 
reads, " What is the amount of wages paid at the end of each 
week?" Wages are perhaps a better measure of dock labour 
than number employed since the number of hours worked varies 
continually, men being taken on for long or short hours, but the 
rate of pay varies little. On both forms there is a question 
as to any special holidays or other events affecting work. On
the form sent to pig-iron works, the question asked is as to how
many furnaces are "in blast," or have been "blown out," or
re-lit ; on that relating to steel, iron, and tinplate works, the 
information required is " the number of shifts " worked in four 
weeks. Another form is to be filled up by a single correspon- 
dent for a wide district, and the returns are entered under the 
headings: Number of mills (1) running full time and giving full
employment; (2) running full time, but giving only partial 
employment ; (3) running short time ; (4) stopped. 
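
The remark above, that wages may measure dock labour better than numbers employed because the rate of pay varies little, amounts to dividing the weekly wage bill by the rate. A minimal sketch follows; the function name, the wage bills, and the rate of half a shilling an hour are all hypothetical figures chosen for illustration.

```python
def estimated_man_hours(weekly_wage_bills, hourly_rate):
    """Convert weekly wage totals into approximate man-hours worked.

    Valid only so far as the hourly rate is nearly constant, which is
    the condition stated in the text.
    """
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return [bill / hourly_rate for bill in weekly_wage_bills]


# Hypothetical weekly wage bills in shillings, at an assumed 0.5s. an hour.
weekly_wage_bills = [1200.0, 900.0, 1500.0]
hours = estimated_man_hours(weekly_wage_bills, hourly_rate=0.5)
print(hours)
```

A count of men taken on would not distinguish long from short engagements, whereas the wage bill reflects both.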

[Side-note: Employment in agriculture.]
Another instance of adaptation of the form to a particular
industry is afforded by the inquiries as to agriculture. In this
case the number of employers is very great, they
are very much scattered, and little used to statistical
inquiries, and the labourers are for the most part uncombined.
On the other hand, in the majority of villages agriculture is the 
predominant industry, every one knows all about every one else, 
and any one intelligent person can give an accurate account of 
the state of labour in his district. It is necessary then to arrange 
with a labour correspondent, a farmer, or a member of the Village 
Council, or the chairman of the District Council, and to apply to 
him monthly for information. 

Only one general organization is necessary for the collection 
of the three groups of figures wanted by the Department for all 
industries. These groups relate to the state of employment, which 
fluctuates continually ; changes of wages, which in some cases 
take place at stated times, in others occur irregularly; and strikes, 
which may begin at any time and last a long while. In the 
case of agriculture, the three groups of questions are placed on 
a single form, though the practice has changed a good deal since
1893.
One form, that in use in 1894, asks for complete details as
to wages and the number employed at the harvest, with a page 
devoted to strikes, and two spaces for remarks on the weather 
and on things in general. 

The next, July 1895, deals with haymaking, strikes, and 
wages. The questions here are as follows : — 

1. Were there any able-bodied agricultural labourers in irregular work
   in your Parish during the month of June?

2. If you answer question 1 in the affirmative, can you give the numbers
   and state about what proportion those in irregular work were of the
   total number of able-bodied agricultural labourers?

3. If you can give the particulars asked for in questions 1 and 2 for any
   neighbouring Parishes, kindly do so.

4. What daily or weekly wages are being paid in the district to the
   regular farm hands during haymaking? Also state how much is
   paid for overtime and what perquisites are given, such as food,
   beer, &c.

5. What daily or weekly wages are being paid in the district to extra
   hands during haymaking? Also state how much is paid for overtime
   and what perquisites are given, such as food.

6. Were there any agricultural strikes in your neighbourhood during
   June? If so, please give the following particulars with reference
   to each strike:
   (1) The date; (2) The cause; (3) The duration; (4) The
   result; (5) The number of men affected.

There are differences in the forms for nearly every month in 
the year ; and the questions have been modified as experience 
suggested till they are finally as follows ; — 
Union

Parish

I. STATE OF EMPLOYMENT.

1. Approximate number of able-bodied agricultural labourers in Parish.
   (If this question has been recently answered by you, you need not repeat your reply.)

2. Were there any able-bodied agricultural labourers in irregular work
   in your Parish during the month of January 1898?

3. If so, can you say about how many were in irregular work in the
   last week of January 1898?

4. Was employment more regular in January 1898 than in January

If you can give the above particulars for any neighbouring Parishes, or
for the whole of the Poor Law Union, kindly do so.



II. CHANGES IN RATE OF WAGES IN JANUARY 1898.

Changes in Weekly Cash Rates of Wages of Ordinary Labourers in January 1898.

(N.B. Ordinary labourers do not include foremen, shepherds, cattlemen,
carters, waggoners, teamsters.)

[Table with the following column headings; the body is left blank for answers.]

   Locality in which Change took place. (State whether Change applies to
   the whole County, or to which Poor Law Unions or Parishes within it.)

   Approximate Number of Labourers who have had a Rise or Fall in Wages
   in January 1898.

   Cash Rates of Wages per Week: Before Change (s. d.); After Change (s. d.).

   Please state in this column for comparison what the Rate of Wages
   was in January 1897.

Name of Correspondent
Postal Address



For convenience of printing the exact spacing allotted for 
answers has not been introduced in these reprints. In the 
agricultural forms a great many square inches are allowed for 
such an answer as " Yes " ; in the others the space is allotted in 
proportion to the amount of information expected. The ques- 
tions as to wages on this form will be alluded to presently. 

The information collected by the Department as to trade 
disputes is detailed and important. The principal questions
in the investigation relate to the causes of disputes,
the methods of settling them, and the total
loss of money to workpeople and employers. Of these the
first two are not statistical questions, but are inserted because
the inquiry has three objects: (1) A general examination
of the causes of and remedies for strikes ; (2) an inquiry as 
to the course of each particular dispute, so as to bring the 
Conciliation Act into operation if possible, or by disseminating 
information to assist an arrangement or compromise ; (3) the 
collection of statistical information. 

The Department is dependent on its own alertness for 
knowledge of the existence of disputes, and its chief sources 
of information are the daily press (London, Provincial, and 
Trade) and special local correspondents, who are expected to 
inform the Department, directly work is stopped owing to a 
dispute, on a special form. 

As to the question, Who knows the facts? obviously the 
only people are the employers and employed ; and since they 
may take different views on all subjects connected with the 
dispute, both parties must be addressed. In this investigation 
partiality and bias in the answers will be at a maximum ; the 
questions must be restricted as far as possible to facts about 
which two opinions are nearly impossible, and any questions 
which will not be answered willingly should be omitted. 

On pages 54, 55 is given the form sent to Trade Unions in 
1895, on pages 56, 57 that used in 1897, and on page 58 the 
letter accompanying the latter. 
LABOUR STATISTICS. STRIKES AND LOCK-OUTS.
Questions as to Strikes and Lock-outs. 1895.

[The form is printed across two facing pages in columns headed Questions,
Answers, and General Observations. The blank answer spaces are omitted
here; where an answer space is subdivided, its sub-headings are noted
after the question.]

Questions.

1. Name of employer, firm or company, and trade.
   [Where more firms are involved than one, or the strike
   or lock-out has been general over a locality, the number
   of employers or firms to be stated as nearly as possible.]

2. Cause or object of strike or lock-out.

3. Whether strike was ordered or approved by trade union.

4. Date of commencement and termination of strike or lock-out.
   (Answer spaces: Date of commencement; Date of termination.)

5. Result of strike or lock-out.
   [If dispute has been respecting increase or reduction of
   wages or hours of work, state exact amount of increase or
   reduction (if any).]

6. Mode of settlement.

7. Number of persons affected.
   (1) Number directly on strike or locked out:
       a. At beginning of dispute.
       b. At end of dispute.
       [Distinguish between adult men and women, and
       apprentices or other young persons.]
   (2) Number employed in factories or works
       where strike or lock-out occurred, and who
       were thrown out of work thereby, but were
       not directly on strike or locked out.

8. Estimated total amount of wages earned in a full
   week (exclusive of overtime) by those affected
   immediately before and after strike or lock-out:
       a. Directly affected.
       b. Indirectly affected.
   (Answer spaces: Before Strike or Lock-out; After Strike or Lock-out.)

9. Number of those on strike or locked out who
   belong to trade unions.

10. Amount expended in support of persons on
    strike or locked out:
       a. By union.
       b. By other strike fund.
    (Answer spaces: Amount per head per week; Total amount expended.)

11. Number of persons who "went in" or returned
    to work before termination of the dispute.

12. Please suggest means of settling or preventing
    labour disputes.
Information for the use of the Labour Department, Board of Trade, 44 Parliament St., S.W.

STRIKES AND LOCK-OUTS.

Part I.
[To be forwarded as soon as possible, without waiting for settlement of dispute.]

[The form is printed in two columns headed Questions and Answers; the
blank answer spaces are omitted here.]

1. Name of Trade affected.

2. Number of Firms involved.
   [If an Employers' Association is concerned in the dispute,
   please give the name and address of its Secretary.
   If there is no such Association, please give the names and
   addresses of the principal firms involved in the dispute.]

3. Cause or object of strike or lock-out.
   (Enclose copy of any application or Notice connected with the
   origin of the dispute.)

4. Date of the first day on which the workpeople were absent
   from work through strike or lock-out.
   (If notices were handed in, give also date of notice.)

[Table. Column headings: Occupations; Men; Women; Apprentices or other
Young Persons. The rows are headed as follows.]

5. State occupations and number of workpeople (Unionists and
   Non-unionists) directly on strike or locked out.

5a. State occupations and number of other workpeople (Unionists
    and Non-unionists) employed in above establishments who
    were thrown out of work owing to the strike or lock-out.

Total Number of workpeople affected*

* If any other workpeople were affected, respecting whom you can state no exact figures,
please give, if possible, the name and address of some person who could do so:
Information for the use of the Labour Department, Board of Trade, 44 Parliament St., S.W.

STRIKES AND LOCK-OUTS.
Part II.
[To be forwarded as soon as the dispute is terminated.]

1897.

[The form is printed in two columns headed Questions and Answers; the
blank answer spaces are omitted here.]

6. Date of termination of strike or lock-out, i.e., the last week-day
   on which the workpeople were on strike or locked-out, or the
   date when the places of the strikers were filled up.
   (If there was no definite end to the dispute, please state approximately
   when it may be regarded as practically closed.)

7. Result of strike or lock-out.
   (Enclose copy of any printed or written agreement that may
   have been made.)

8. Describe the steps taken which resulted in the settlement,
   giving the names of any organizations or persons who
   assisted in bringing this about.

If the result involved a Change in the Rate of Wages or Hours of Labour, give
the following particulars for all workpeople whose wages or hours were changed,
whether Strikers or not:

[Table with the following column headings; the body is left blank for answers.]

   Occupations affected by Changes in Wages or Hours.

   Number of Workpeople whose Wages or Hours were changed.*

   Date from which Change takes effect.

   Rate of Wages† in a Full Week, exclusive of overtime:
   Before Change; After Change.

   Hours of Labour in a Full Week, exclusive of meal times and overtime:
   Before Change; After Change.

* This is not necessarily the number on Strike or Locked out.

† When there has been a change in piece rates, please give the percentage
increase or decrease in piece prices, and approximately the average
earnings in a full week (exclusive of overtime) before and after change.

Signature

Address

Date
Labour Department, Board of Trade,
44 Parliament Street, London, S.W., 1897. 

Dear Sir, — The Labour Department of the Board of Trade is 
desirous of obtaining a complete and accurate record of Strikes and 
Lock-outs, and Changes in Rates of Wages and Hours of Labour in 
the United Kingdom as they occur, for publication in the Annual 
Reports presented to Parliament, and also in the "Labour Gazette," 
which is issued monthly. 

These statistics are collected and published by the Department in 
pursuance of the following Resolution adopted by the House of 
Commons on the 2nd March, 1886 : — 

"That in the opinion of this House immediate steps should be taken to ensure
in this country the full and accurate collection and publication of Labour 
Statistics." 

As the value of these statistics is greatly increased if the parties 
concerned co-operate with the Department by supplying accurate 
information, I should be glad if you would kindly answer as many as 
possible of the questions asked on the inner pages of this form so far 

as they relate to the 



If from any cause you are unable at present to answer the questions 
on Part II. of the form, will you be so good as to fill in and return 
Part I. at once, and send Part II. as soon as it is possible to do so.

I have to add that any information you may be good enough to 
furnish will be used solely for statistical purposes, and will not be 
published under your name. 

A circular asking for similar particulars is addressed to the employer 
affected by this dispute. — Yours faithfully. 



Chief Labour Correspondent.
The letter given with the later of these forms is a particularly
careful one, showing the object of the inquiry, promising secrecy, 
and guaranteeing an impartial survey by the statement that 
similar forms are sent to employers and workmen. The 
forms addressed to employers are precisely similar in general 
appearance. 

[Side-note: Change of form.]
The main difference between the forms is that the later is
divided into two parts, the first of which can be filled up directly
work is stopped by a dispute, so as to give the
Department a clue as to its magnitude and cause.
The second part is detachable, and is to be preserved till the 
dispute is ended, and then forwarded. The advantages of this 
method are that the Department has early information as to 
the exact facts about the strike, and that the figures are given 
while the facts are fresh in the mind of the informant, whereas 
at the end of a long struggle, they might have been forgotten. 
Should the second part not be forwarded, the Department would 
of course write for it, or send a duplicate. 

Question 1 on the earlier form is modified and split into two
on the second. Question 2 on the later is simpler than the
parenthesis of question 1 on the earlier, but asks for the more
important information as to employers' associations, which will 
lead to the blank schedule being sent to the addresses given. 

Question 2 of 1895, 3 of 1897, is the same on all four forms 
(the two to trade unions, and the companion forms to employers); 
it is not a statistical question, and probably leads to vagueness 
and to contradictory statements on the part of employers and 
employed; but the new parenthesis ("enclose copy, &c.") is 
important, for it leads to definite statements about which there 
can be no dispute. 

The next question on the earlier form had of course to be 
altered for the new double sheet. Since the chief statistical 
information needed relates to the exact number of days' work
lost, it is necessary to know exactly the date of the commence- 
ment of idleness ; this day is therefore very carefully defined on 
the later form, as not that on which notices were sent in or any 
preliminary steps taken, but that of the actual commencement of 
hostilities. Question 6 (date of termination) is also carefully 
worded. The date of notices (question 4) gives useful sub- 
sidiary information. 

There is considerable difference between question 5 and 5a
on the later and the corresponding question 7 on the earlier.
The only difficulty in using the information
obtained from the new form arises at this point.
The number affected by a strike, especially the number indirectly 
affected, changes continually, rising gradually to a maximum and 
then rapidly decreasing as the dispute draws to a close. The 
1895 form did not give enough information, for the numbers at 
intermediate dates cannot be deduced from the numbers at the 
beginning and at the end, so that we have not the necessary 
data for determining what we chiefly want, the number of days'
work lost (i.e., the sum of the numbers of days lost by each
person affected). In the case of a long dispute, however, this
information is revised monthly at least, as is shown by the
monthly report in the Labour Gazette.
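
The point that the days' work lost cannot be deduced from the numbers at the beginning and end alone may be illustrated by a small sketch. The daily figures below are hypothetical, and the "endpoint" estimate is merely one naive rule chosen for comparison; neither comes from the Department's returns.

```python
def days_work_lost(numbers_idle_each_day):
    """Total days' work lost: the number idle on each working day, summed.

    This equals the sum, over persons affected, of the days each lost.
    """
    return sum(numbers_idle_each_day)


# Hypothetical dispute of six working days: the number affected rises
# gradually to a maximum and then falls rapidly, as the text describes.
numbers_idle = [100, 300, 500, 500, 200, 50]

exact = days_work_lost(numbers_idle)

# Naive estimate using only the first and last figures.
endpoint_estimate = (numbers_idle[0] + numbers_idle[-1]) / 2 * len(numbers_idle)

print(exact, endpoint_estimate)
```

With these figures the endpoint rule gives little more than a quarter of the true total, which is why intermediate returns are needed in a long dispute.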

[Side-note: The spreading of a strike.]
The chief improvement in the new form consists in
allotting separate spaces for different occupations. Several
classes of workpeople will probably be affected in
different ways by a strike in a complicated industry.
Thus if the cotton spinners are on strike, very likely
the carders will go out either on a grievance of their own or 
from sympathy. The spinners' assistants, the piecers, are at
once thrown out of work, as are also the overlookers of the 
mules. As the strike continues all the departments of the 
spinning mill will be closed, one after the other. In the form 
four lines are allowed for those directly affected, eight for other 
classes unwillingly on strike. A great dispute, however, is not 
limited in its effect to the spinning mills. The supply of yarn
falls off and the weavers are stopped; then the export trade is
diminished, and dock labourers and sailors are thrown out of
work, and so the influence of the strike spreads. It is out of
the question to estimate completely these indirect effects ; but 
in order to trace them as far as possible, space is given on the 
second form for the address of any one who can give information 
about them. 

Question 6 on the earlier form, "Mode of settlement?" has 
been expanded considerably in the later one, since the question 
cannot well be answered in a single word, and the exact details 
are important for the non-statistical part of the inquiry.
Question 5 in the earlier has also been altered ; the important 
request for printed agreements is added, and the parenthetical 
part has been grouped with question 8 so as to form the new 
question 9. This alteration, the same on employers' and workmen's
forms, is worth special attention. The distinction between
"directly" and "indirectly affected" is practically useless, and
difficult to maintain in filling up the form. It is far more im- 
portant to distinguish the different classes of workpeople, as can 
be done in the nine lines of the new form. Again, it is difficult 
to state the "total wages before and after," and the question 
leads to inaccuracies ; the new question 9 is far better, for it 
is precisely that easiest to fill in, and most useful when done, 
and is in the exact form wanted by the Department for its 
register of changes of wages and hours. It is important in 
this question 9 to include all, whether on strike or not ; hence 
we have the italics in the heading and a footnote to the second 
column. This footnote could be improved, for at present the 
wording is a little obscure, and the notice might be put with 
advantage in the heading. 

There remains a series of questions which have been dropped 
out in the later form. It may be supposed that it was found
that the answers were not accurately given, that
the inclusion of the questions overloaded the form,
and by tending to inaccuracy in the answers led to inaccuracy in
other details ; while in cases when it was possible to obtain 
correct answers, it was found best to do so by other methods or 
a separate inquiry. There are two sets of questions : trade 
unionists are asked the amount they spent ; employers the value 
of capital left idle. 

Question 3 on the older trade-unionist form has nothing to 
do with the statistical inquiry. Question 9 simply affects the 
relation of unionists to non-unionists, may lead to exaggerated 
answers, is not wanted for tabulation, and is apart from the main
inquiry. Question 10 belongs properly to a separate inquiry ; the 
total might not be known to the secretary who fills in the form, 
and the amount expended by unions on " strike benefit " is com- 
piled annually from other sources. The question is too compli- 
cated to be placed with advantage at the end of a long form. 
Question 11 is too vague to lead to the information wanted, 
though knowledge of the facts is needed. Question 12 is hardly 
likely to yield any results worth having, since all possible means 
have long been canvassed. The answers are, however, tabulated
in the Report on Strikes of 1894 (C. 7901), which may be studied
with advantage in connection with these forms, and some of
them may be quoted: "Give in to the wants of the men, so
that they are not extraordinary." "Abolish capitalism." "No
means have yet been discovered." "Make all men Christian."
"Fair argument." "A little more common honesty on the part
of employers" (pp. 229-240).

The questions omitted in the later, but present in the earlier 
employers' forms are subject to similar criticisms. The sub- 
division of question 9, distinguishing summer and winter wages, 
and the separate columns for hours, are only in the later form. 
A comparison of these two forms with any number of the Labour 
Gazette and the Annual Report on Strikes and Lock-outs of 1894 
will throw considerable light on the uses and difficulties of 
forms of inquiry. 
Section 4. — Statistics of England's Foreign Trade.

The original schedules which lead to many other statistics 
are interesting, but limits of space must restrict us to one more 
typical inquiry, that which leads to our statistics of foreign 
trade. 

In the population census the filling in of the form is com- 
pulsory and done by the householder ; in the wage census 
the answers were voluntary and given once and for all by the 
employer ; in the various inquiries undertaken by the Labour 
Department the answers are voluntary, but in many cases 
periodic, so as to become quasi-official. The method of collection
of import and export statistics is a blend of all these.
[Side-note: The informants.] There are three classes of persons who know the
facts in question: the sender of the goods, the
custom-house official through whose hands they pass, and the 
recipient or his agent. Circumstances decide that, in the case of 
exports from the United Kingdom, the exporter or his agent 
sends an account of the quantity and value of goods de- 
spatched to the Statistical Office of the Board of Trade ; that, in 
the case of imports, the receiving-agent hands over an account 
of goods to be landed to the custom-house officials, who verify 
the account, roughly if the goods are duty free, carefully if they 
are liable to duty ; and that, in the case of transhipment, the 
goods are treated in the same way as imports at the port of 
landing, and to some extent verified at the port of embarkation. 

The blank forms, being filled in by officials as part of 
their duty, or by agents thoroughly used to the task, need no 
covering letter, and may be made as complicated as necessary ; 
no questions are inserted but only blank tables. An examina- 
tion of the forms in use will show what are included as exports 
and imports in the Board of Trade totals, and what is the total 
amount of information available for tabulation. 

[Side-note: Requisites and data.]
The quantities we wish to measure in this investigation are:
the volume or weight and value of all goods which have an
exchange value, which leave our shores or reach
them from without, subdivided as regards classes
of commodities and countries of destination or origin; the values
being those at the times of loading or unloading. The quantities
we can measure are sharply distinct from these, being the
records of values and volumes which reach the Board of Trade.
We should therefore examine the forms to decide: (1) What
part of imports and exports are recorded ; (2) whether the values 
are correctly given, (3) the quantities accurately registered, 
(4) the commodities accurately defined, (5) the countries of 
origin and destination accurately distinguished in the returns. 
[Side-note: Example of information.]
On reaching port the ship's master has to send in an
account, of which the following is an abridged
specimen:

No. 1.
Port of X.                                   REPORT No. 980.*

If Sailing Vessel or Steamer: STEAMER.
Official No., No. of Register, Date of Registry: [left blank]

Ship's Name: Marianne.
Tonnage: 700.
British or Foreign. If British, Port of Registry; if Foreign,
   Country to which she belongs: BRITISH. [Port of registry illegible in this copy.]
Number of Crew: British Seamen, 12; Foreign Seamen, -.
Name of Master, and whether a British or Foreign Subject: H. Hind.
Port or Place from whence arrived: Havre, France.

Cargo.

[Table with the following column headings.]

   1. Name or Names of Places where laden in order of time.
      (If any wreck fallen in with or picked up, to be stated.)
   2. Marks.
   3. Nos.
   4. Packages and Description of Goods. Particulars of Goods stowed
      loose, and General Denomination of Contents of each Package of
      Tobacco, Cigars, or Snuff intended to be imported at this Port.
   5. Particulars of Packages and Goods (if any) for any other Port in
      the United Kingdom.
   6. Goods (if any) to be Transhipped or to remain on Board for
      Exportation.
   Name of Consignee.

[Entries, all laden at Havre, France.]

   Marks    Nos.      Packages and Description
   -----    ----      ------------------------
   (Paris to London)  60(?) pkgs. Fruit and Perishables. Consignee: Smith.
   COK      1392  \
   AE       495/6  |
   KG       340/9  |  68 pkgs. Merchandise.
   FOT      1/50   |
   AJ       3/6    |
   CK       1     /
   AC       10    \
   KL       40     |  70 cases Wine.
   ACD      20    /
   WD       166       5 cases Woollens in transit to Liverpool.
   O&D      1         1 case Brandy.

Stores.

Surplus Stores remaining on board, viz.: [entries illegible] Tobacco.
Number of Alien Passengers (if any): Nil.
Pilot's Names:
At what Station Ship lying: South Quay.
Agent's Name and Address: C. J. C.

I declare that the above is a just report of my Ship and of her Lading, and that
the Particulars therein inserted are true to the best of my knowledge, and that I have
not broken Bulk or delivered any Goods out of my said ship since her departure from
Havre, the last Foreign Place of Loading.

(Signed) H. HIND, Master.
Signed and declared this 13th day of October 1896
In presence of

(Countersigned)
Collector.

* i.e., 980th ship at X. since 1st January.
The goods for quick transit are passed at once, and a special
form is sent to the Board of Trade similar in character to that
on p. 67. The remaining goods are treated either
as dutiable or as duty-free articles. In the list
before us, ten cases of wine are entered for home use, and an 
account is sent in to the Statistical Office ; sixty cases are ware- 
housed and another account (as to quality, quantity, and value) is 
sent in ; the whole are registered as imports. Twenty of the ware- 
housed cases are removed to another port and re-exported ; an 
account is sent, and they are entered as exports of foreign goods. 
Twenty are put on board ship as stores at the original port, and 
twenty more removed to another port for the same purpose, and 
of this the central office takes no account ; the remainder are 
removed to another warehouse, still in bond, and on leaving that 
will be treated in one of the four ways just mentioned. Other 
dutiable articles are treated in the same way. 
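
The treatment of the seventy cases of wine can be followed as a small tally. The sketch below simply takes the figures as printed above and applies the rules stated in the paragraph; the dictionary keys and variable names are labels invented for the illustration.

```python
# Cases of wine on landing, as given in the text.
landed = {"home_use": 10, "warehoused": 60}

# Later movements of the warehoused cases, as given in the text.
from_warehouse = {
    "re_exported_from_other_port": 20,  # entered as exports of foreign goods
    "ship_stores_original_port": 20,    # no account taken centrally
    "ship_stores_other_port": 20,       # no account taken centrally
}

# The whole consignment is registered as imports, warehoused or not.
registered_imports = sum(landed.values())

# Only the re-exported cases appear again, as exports of foreign goods.
exports_of_foreign_goods = from_warehouse["re_exported_from_other_port"]

# Cases shipped as stores leave the warehouse without any central record.
unrecorded_as_stores = (
    from_warehouse["ship_stores_original_port"]
    + from_warehouse["ship_stores_other_port"]
)

# Whatever is left stays in bond until it is disposed of.
remaining_in_bond = landed["warehoused"] - sum(from_warehouse.values())

print(registered_imports, exports_of_foreign_goods,
      unrecorded_as_stores, remaining_in_bond)
```

The tally makes visible that imports are counted once on landing, while goods leaving as ship's stores drop out of the trade accounts altogether.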

[Side-note: Examination of goods.]
Goods not sufficiently described or not answering to their
description are opened, their contents entered on a "bill of
sight," and an account sent in. Private effects are
separately examined, being described on a "sufferance"
form; if they are bona-fide personal goods no record is
kept of them, except in the case of dutiable goods, which are 
treated as ordinary imports. If the dutiable goods are con- 
cealed, either among private effects or merchandise, and forfeited, 
they are not reckoned as imports. 

Bullion is entered on a separate form and kept distinct 
throughout the accounts. 

The duty-free goods, if for transhipment at another port, are 
sent there under seal, and barely examined ; they are treated at 
the central office in the same way as dutiable 
transfer goods. The remaining free goods, which 
in general form the bulk of the cargo, are entered on such a form 
as follows, which is worth notice, for it is a specimen of the 
rough material from which our foreign trade figures are 
evaluated. 
ENTRY FOR FREE GOODS. 

(This space is for the use of the Officers of Customs.) 

Port ........  Dock or Station ........  Importer's Name ........  No. ....  Examination .... 

Ship's Name : Marianne.      Master's Name : H. Hind.      Rotation No. : 980. 
Date of Report : 13/10/96.   Port or Place whence : Havre, France. 

Marks and    No. of Packages and Description of Goods,               Value. 
Nos.         in accordance with the Official Import List. 

COK 1392     One   Goods Manuf. N.O.E. Billiard Cue Tips                 28 
AF 495/6     Two   Leather Shoes                                         58 
KG 340/9     Ten   Cotton Manuf. Trimmings                              140 
                   Embroideries                                         280 
                   Piece Goods, not Muslins                               8 
FOT 1/10     Ten   Gloves of Leather                                 12,316 
 „  11/5     Five  Silk Broad Stuffs                                 10,400 
 „  16/20    Five  Works of Art : Plaster Casts                         380 
                   Statuary                                           1,280 
                   Pictures by Hand                                  10,200 
 „  21/5     Five  Books Bound                                          300 
 „  26/30    Five  Bronze Manuf. Ornaments                               38 
 „  31/5     Five  Metal Manuf. Ornamental Brass-headed Nails            24 
 „  36/40    Five  Silk Manuf. Dresses, Mantles, Trimmings            1,816 
 „  41/50    Ten   Goods Manuf. N.O.E. : Fancy Goods                    110 
                   Horseless Carriage                                   160 
                   Brushes                                               78 
                   Glue                                                 110 
                   Billiard Chalk                                        12 
                   Hardware                                             116 
AJ 3/6       Four  Stationery Ink                                        48 
CK 1         One   Iron and Steel Manuf. Machinery, British, returned    24 

Quantity (entries in the order printed ; not every line of the form carries one) : 
10 doz. prs. ; 300 yds. ; 11,240 doz. pr. ; 3 ; 4 cwt. ; 3 cwt. ; 4 cwt. ; 3 cwt. 

I enter the above goods as free of duty, and declare 
the above particulars to be true. 

Dated this 13th day of October 1896. 

(Signed) J. Jones, 
Importer or his Agent. 

The information so received is usually accepted at the 
central office without inquiry. It frequently happens, however, 
that the form is not properly filled in by the agent, the values 
often being omitted. When this is so, it is the 
duty of the clerk at the port of entry to fill in the 
value, in accordance with a list of current prices with which he is 
provided. It may happen that he has to appraise the goods on 
inspection, a process leading in some cases to great error, which 
is enhanced when not even the quantities are given. When 
there is a palpable error or omission in the form, or when the 
price appears out of the common, a query is sent from the 
central office to the port : e.g., with reference to such a form as 
that just given, the following correspondence might arise : 

1. Pictures by hand, £10,200. Explain high value. Answer : 
Correct ; invoice was seen ; pictures by Millet. 

2. Books bound : is weight or value incorrect ? Answer : 
Both correct ; advice seen ; old and valuable books. 

3. Goods entered as "goods manufactured, chip plaiting" : 
explain nature, and state if description is correct. Answer : 
Correct ; wood shaving plaited and occasionally mingled with 
horse-hair, &c. 

4. Potatoes, 40 cwt., £62. Weight or value ? Answer : 
Value correct. Weight should be 400 cwt. 

Thus any unusual entries are liable to be checked and 
verified. 
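
The kind of check that prompts such a query can be pictured as a comparison of the declared value per unit with a range of current prices. The sketch below is illustrative only : the function name and the price range assumed for potatoes are inventions for the example, and nothing here is claimed to be the practice of the central office.

```python
def needs_query(quantity, value_pounds, low, high):
    """Return True when the value per unit lies outside the expected range.

    low and high are assumed bounds, in pounds per unit, taken from a
    hypothetical list of current prices; they are not official figures.
    """
    if quantity <= 0:
        return True  # a missing or impossible quantity is always queried
    unit_value = value_pounds / quantity
    return not (low <= unit_value <= high)


# Entry 4 of the correspondence: potatoes, 40 cwt., value 62 pounds.
# Assumed price range: 0.10 to 0.25 pounds per cwt.
print(needs_query(40, 62, 0.10, 0.25))   # the entry as first written
print(needs_query(400, 62, 0.10, 0.25))  # the entry after correction
```

With the assumed range, the first entry is flagged and the corrected one passes, which is the behaviour the fourth query illustrates.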

In the case of goods not easily valued, or of miscellaneous 
goods not easily tabulated, errors must arise in this way ; and 
another error may enter if a clerk, who does not 
wish to receive too many queries from headquarters, 
enters at ordinary rates goods of exceptional value ; 
but when staple commodities and large quantities are involved, 
all the persons concerned will be familiar with the forms they 
have to fill, the prices will be known, and so in important cases 
errors will be at a minimum. The import total values, there- 
fore, are the sum of many quantities of various degrees of 
accuracy, and it is not difficult when looking through the list of 
items in the annual report to see which are specially liable to 
error. Such commodities as old books, works of art, goods 
where sale depends on the fluctuations of fashion, racehorses, 
and so on, have values varying from day to day, and their 
exact value in the balance of imports and exports cannot be 
determined. 

The quantities and values of exported goods are filled in by 
the shipper or agent, and sent to the central office 
within six days of the ship's clearing. The follow- 
ing is an abridgment of the form used : — 
The forms for British and Irish goods are distinct from those 
for foreign, free and duty-paid, goods ; and there are distinct 
export forms for transhipments, which have already been regis- 
tered as imports. In these cases the specification and quantities 
are likely to be correct, but there are causes which may falsify 
the values. If they are to be subject to an ad valorem duty, they 
may be undervalued ; if they are adulterated goods, masquerad- 
ing as genuine, they may be over-valued. It seems hardly 
possible to estimate these errors. 

We are now in a position to define imports and exports 
according to their meaning in the Board of Trade 
Returns ; as, for instance, when for 1895 the value 
of imports is stated as £416,000,000, and of 
exports as £285,000,000, of which £60,000,000 are re-exports 
of foreign or colonial goods. 

This total for imports includes all goods landed through the 
custom-houses, including goods immediately shipped as stores, 
or returned from customers unused. Goods immediately re- 
shipped at the same or another port, or held in bond and then 
re-shipped, are included both as imports and exports. Bullion is 
not included, being given separately, nor cargo unlanded and 
so reported, nor personal luggage or private effects, except when 
duty is charged. The value reckoned is the nominal exchange 
value when or just before they are landed ; that is, their value is 
already increased by freight, but not increased by duty. 

The total for exports includes all goods entered on ships' bills of 
lading, but does not include ships' stores or passengers' luggage, nor 
cargo unlanded and so reported, nor bullion, which is given separ- 
ately. The value is reckoned at the time they are put on board. 
Ships leaving our shores to be sold to foreigners are now included. 

The treatment of coal throws light on this paragraph. Coal 
taken for use on the voyage is registered, but not included 
among exports ; coal as cargo is included. 

Among exports not registered are cash taken privately and 
personal effects ; among imports not registered are smuggled 
goods, and cash and personal effects. 
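
The rules of inclusion just stated can be collected into a small sketch. The category labels and function names below are hypothetical, chosen only for this illustration ; the rules themselves are those given in the preceding paragraphs.

```python
# Kinds of entry that the text says are left out of the official totals.
# The labels are hypothetical names chosen for this sketch.
EXCLUDED_FROM_IMPORTS = {
    "bullion",              # given separately
    "cargo_unlanded",       # unlanded and so reported
    "smuggled",             # never registered
    "forfeited_concealed",  # concealed dutiable goods, forfeited
}
EXCLUDED_FROM_EXPORTS = {
    "bullion",
    "cargo_unlanded",
    "ships_stores",         # includes coal taken for use on the voyage
    "passengers_luggage",
}


def counts_as_import(kind, duty_charged=False):
    """Private effects count as imports only when duty is charged."""
    if kind == "private_effects":
        return duty_charged
    return kind not in EXCLUDED_FROM_IMPORTS


def counts_as_export(kind):
    """Everything on the bill of lading counts, save the excluded kinds."""
    return kind not in EXCLUDED_FROM_EXPORTS


print(counts_as_import("private_effects"))        # False
print(counts_as_import("private_effects", True))  # True
print(counts_as_export("coal_cargo"))             # True
print(counts_as_export("ships_stores"))           # False
```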

For the causes and extent of the resulting differences between 
imports and exports, Sir R. Giffen's two papers * on the subject 
should be consulted. 

* Essays in Finance, Second Series ; and Journal of the Royal Statistical 
Society, 1899. 
TABULATION. 



Leaving now the consideration of blank forms of inquiry, let 
us turn to the methods by which our data, accumulated on these 
forms, can be tabulated. At first sight the tabulation of so many 
million census forms, so many schedules of wages, and so many 
lists of goods imported, seems mere office work, to be done 
mechanically,* only requiring accuracy and not subject to 
scientific analysis. Tabulation does, indeed, involve a great 
deal of automatic labour ; but the determination of the exact 
form of the table and the choice of the headings to which the 
totals shall correspond task the administrative statistician, and 
are worth the closest study. 

The function of tabulation in the general scheme of a statistical 
investigation is sufficiently definite ; it is to arrange in easily 
accessible form the answers to those questions 
with which the investigation is concerned. If it 
is required to know, for instance, the number of persons of each 
sex and age-group in all the districts of the country, the figures 
in the table must show these numbers. Or, to take a less definite 
problem, we want all the information possible as to labour dis- 
putes. In studying the forms issued by the Labour Department, 
we have seen that the information which can be obtained is not 
precisely that which we require. The problem then is so to 
tabulate our information that our totals may give answers as 
near to our requirements as possible, and it can easily be 
found by experiment that the way to do this is by no means 
obvious. 

Not only must the figures be grouped so as to answer the 
questions put forward in the original scheme, but if the 
information is of wide and varied interest, as in all the 
investigations so far considered, the data must be studied from many 
points of view, and tabulated so that students in all branches of 
knowledge may be able to extract from our tables the 
information they require. Thus the population census is used by 
the financier, the legislator, the merchant, and the commercial 
traveller ; political economists turn to it for light on the 
development of industry, and on the change of numbers in 
each trade ; those interested in social questions will study 
the ages and sex-distribution in various districts or 
occupations ; the sociologist and biologist will need accurate 
information as to the growth of population and the change of age 
distribution. 

* An account of Mr Hollerith's electrical tabulating machine, used in the 
XIth Census of the U.S.A., will be found in Dr Bertillon's Cours Élémentaire, 
p. 579 seq. 

To take more specific points, the blue-book which con- 
tains the tabulation of foreign trade statistics will be ex- 
pected to show how our trade with each country is de- 
veloping, whether we are holding or improving position in 
certain markets ; whether we are exhausting our supply of raw 
materials ; whether some new commodity is yet of importance. 
It must be remembered that the original material is not 
accessible to the public, that they are dependent on the 
information extracted for them, and that, though it would be 
possible to turn through all the forms for special data, yet the 
labour needed would be prohibitive, while a little more detail 
in the tabulation might easily have isolated the information 
needed. 

For convenience, the methods of tabulation may be divided 
into three groups : A. The simple statement of totals of persons 
or things which satisfy given conditions, such as 
the number living in a town, or the total value of 
imports from France ; B. The grouping of a great number 
of units in relation to some particular property possessed 
by all, e.g., the population according to ages, or wage-earners 
according to the value of their wages ; C. The tabulation 
of non-numerical answers in suitable groups to give 
a view of the whole, e.g., the causes of strikes or the state of 
employment. 

In the tabulation the convenience of the reader must be 
studied. The table must be so arranged that any totals required 
can instantly be found. This is to a great extent a question of 
typography, the use of suitable founts for figures and headings, 
and also of the choice of the right shape and size of page. 
Supposing the best possible choice made in these respects, our 
rule will then be to get the maximum amount of information into 
the minimum space. 



Group A. Thus we can have single tabulation, answering 
one or more groups of independent questions, as : 



Number and Membership of Trade Unions.* 

Year.     Number of Trade Unions     Total Membership of these 
          at end of Year.            Unions at end of Year. 

1896            1,317                      1,493,375 
1897            1,307                      1,611,384 
1898            1,267                      1,644,591 



Double tabulation shows the subdivision of a total according 
to two categories, in the following example according to sex and 
age:— 



Classification of Paupers in Ireland. Total Numbers who 
received Relief during the Year ended Lady Day 1892.† 

Ages of Persons Relieved.       Males.     Females.     Total. 

Under 16 years                  44,391      43,648      88,039 
Of 16 and under 65 years       132,370      79,045     211,415 
Of 65 years and upwards         35,121      45,668      80,789 

All ages                       211,882     168,361     380,243 

* Compiled from the Sixth Annual Abstract of Labour Statistics, p. 1. 
† Ibid., p. 102. 
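
A double tabulation is complete only when its margins agree with its cells, and this is easily checked. The following sketch takes the six cell figures of the Irish table and recomputes the row totals, the column totals, and the grand total ; the function name and the layout of the data are assumptions made for the illustration.

```python
# Cells of the Irish pauper table: (age group, sex) -> number relieved.
cells = {
    ("Under 16", "Males"): 44391,
    ("Under 16", "Females"): 43648,
    ("16 and under 65", "Males"): 132370,
    ("16 and under 65", "Females"): 79045,
    ("65 and upwards", "Males"): 35121,
    ("65 and upwards", "Females"): 45668,
}


def margins(table):
    """Return (row totals, column totals, grand total) of a two-way table."""
    row_totals, col_totals = {}, {}
    for (row, col), n in table.items():
        row_totals[row] = row_totals.get(row, 0) + n
        col_totals[col] = col_totals.get(col, 0) + n
    return row_totals, col_totals, sum(table.values())


rows, cols, grand = margins(cells)
print(rows)   # totals for each age group
print(cols)   # totals for each sex
print(grand)  # 380243
```

The totals are recomputed from the cells rather than copied, so that any misprint in a printed margin shows itself at once.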
More information may be included thus : 

Classification of Paupers in England and Wales. Total 
Numbers who received Relief during the Year ended Lady Day 
1892.* 

Ages of Persons              Indoor.     Outdoor.     Total.      Metropolis.   Other Parts 
Relieved.                                                                       of England 
                                                                                and Wales. 

Under 16 years               111,782      441,805      553,587     100,671       452,916 
Of 16 and under 65 years     232,284      385,299      617,583     148,066       469,517 
Of 65 years and upwards      114,144      287,760      401,904      64,779       337,125 

All ages                     458,210    1,114,864    1,573,074     313,516     1,259,558 



A TREBLE tabulation can be used, subdividing the total into 
three distinct categories, with cross totals for each group. Thus 
the following table gives separate divisions according to age, 
sex, and district ; percentage lines, in a distinct type, are also 
introduced : — 



* Ibid., p. 101. 
The same process can be further extended : the example 
in the table opposite shows an arrangement for a QUADRUPLE 
tabulation, distribution by district, date, sex, and occupation, 
with subsidiary information ; but it is generally better to use 
two or more tables than to increase the complication, unless 
it is necessary to bring several categories into close relation. 
Suitable varieties of type will often make comparisons easy in 
a very complex table. 

Looking now at the census householders' schedule (p. 23), 
it will be seen that there are about twelve different items 
of information about each person : county, town, 
parish, position in family, civil condition, sex, age, 
occupation, industrial position, infirmity, birthplace, and house-room. 
These could be tabulated in 66 different single, 220 
double, or 495 treble tabulations, so that there is plenty of 
scope for choice. 
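
The three counts quoted agree with the numbers of ways of choosing two, three, and four items out of twelve. The short check below verifies only that arithmetic ; reading the counts as combinations is an assumption about how they were obtained, and the pairing of each count with "single," "double," and "treble" is taken as stated in the text.

```python
from math import comb

items = 12  # items of information on the schedule, as counted in the text

# Number of ways of choosing k of the twelve items, for k = 2, 3, 4.
choices = {k: comb(items, k) for k in (2, 3, 4)}
print(choices)  # {2: 66, 3: 220, 4: 495}
```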

To fix our ideas, we will take occupation as the main subdivision, 
and examine Mr Booth's use of the census 
returns, say for London Printers.* 
First he gives a treble classification (occupation, sex, and 
age), using columns 4, 5, and 6 of the schedule. 



Census Divisions, 1891.     Females.               Males.                  Total. 
                            All Ages.     -19.     20-54.      55-. 

1. Printer                    1,316       9,988    21,784     1,921       35,009 
2. Lithographer, &c.            809         757     3,037       437        5,040 

Total                         2,125      10,745    24,821     2,358       40,049 



Then follows a single table, district and numbers, using the 
information on the back of the schedule. 



Distribution. 

E.          N.          W. & C.       S.           Total. 

5,884       9,835       7,577         16,753       40,049 



* Life and Labour of the People, vol. vi., p. 189. 



Three simple tables are then given, relating to heads of 
families, using columns 2 and 4 (sex), 2 and 10 (birthplace), 
and 2, and 7, 8, 9 (industrial status). 

His next table uses columns 2 and 6, and is as follows : — 



Total Population Concerned. 

                       Heads of     Others       Unoccupied.    Servants.    Total. 
                       Families.    Occupied. 

Total                  18,048       16,060       47,257         854          82,219 
Average in Family      1            .89          2.62           .05          4.56 
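
The line of averages follows from the line of totals by dividing each total by the number of heads of families. A minimal sketch of the division, using the totals of the table ; the variable names are chosen for the illustration only.

```python
# Totals of the table of population concerned.
totals = {
    "Heads of Families": 18048,
    "Others Occupied": 16060,
    "Unoccupied": 47257,
    "Servants": 854,
}

heads = totals["Heads of Families"]

# Average number of each class per family, to two decimal places.
averages = {name: round(n / heads, 2) for name, n in totals.items()}
averages["Total"] = round(sum(totals.values()) / heads, 2)

print(sum(totals.values()))  # 82219
print(averages)
```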



The next table (not here given) is a single classification 
according to number of rooms and servants, a most ingenious 
indirect use of the scheduled information ; and the last is an 
example of the legitimate use of a quadruple tabulation 
(occupation, industrial status, sex, and age), given on the next 
page. 
It would be difficult to find a better example of tabulation 
of a great multitude of details to serve a special purpose. The 
census authorities had in many cases not tabulated 
the necessary details, and it was necessary to turn 
through the original schedules to get at the facts. For such 
work as this, the function of tabulation is simply to provide 
the answers to definite questions. Thus the census reports 
show how many persons of each sex and age-group belong 
to certain industries in certain places, in a quadruple tabulation 
extending over many pages, each page relating to one district, 
and this table may be used for accomplishing many separate 
purposes : each item is already a total ready for use. It is 
impracticable from limits of time and space, even if it were 
desirable, to tabulate all the possible groups of qualities which 
can be made from the twelve statements on each census form ; 
a good tabulation will aim at providing only those statements 
which are of practical use. Thus many simply descriptive totals 
are given, such as the numbers of each sex and age in each parish 
in the United Kingdom, to serve primarily for administrative 
purposes ; and many statements which will afford the economist 
and sociologist the opportunity of tracing the progress of in- 
dustries, of studying the ages of workpeople in different occu- 
pations, the changes in age-grouping of the nation ; and some 
further tables might be given to throw light on problems of 
cause and effect, such as the average ages in town and country, 
the connection between infirmities and occupation, or the ages 
of marriage in various districts or industries. 

It is interesting to open one of these great tables of figures, 
such as are generally to be found forming the bulk of a blue-book, 
and taking a figure at random, ask "Why 
is this figure printed, what question does it answer, 
to whom can it give information ?" For instance, in the Eighth 
Report on Trade Unions, p. 257, we find that the United 
Brickworkers' and Brick Wharf Labourers' Union spent £20 
on funeral expenses in 1894, an average of 3s. 7½d. per member. 
As an isolated statement this may interest a very small number 
of persons ; but that small number has a right to expect 
that they shall find the figures relating to their union tabulated 
in a general official book ; to them it may be as important as 
the item, on the same page, of £5,481 spent by the Boilermakers. 
From this point of view, the question of inclusion of 
such small items is simply one of space. If space is limited, 
a selection would be made of larger quantities only, as being 
likely to concern more people. 

But there is a reason of quite another character for printing 
such items as these. The raw material, on which the totals 
in such tables are based, is not accessible to the 
student except by means of this Report. Now, the 
compiler of these statistics cannot know from what particular 
point of view they will be studied. It may be desired to 
examine and group trade unions according to their expendi- 
ture on different items, to study their history, classifying them 
as fighting organisms and as friendly societies. The tabula- 
tions needed cannot well be foretold. The material is there- 
fore given in the rough, in order that the tabulation may be 
made by each student according to his needs. At the same 
time the most suggestive totals are given as one of these 
possible methods of tabulation ; and in the summary of such 
a report, the items are retabulated, the rough material being 
omitted, in those ways which the editor thinks most useful. 

When space is much too limited for any publication in 
extenso of the items, a careful selection must be made of those 
to be printed ; and it is this selection that is 
generally open to most criticism. Owing to the 
great admiration for uniformity generally to be found in the 
official mind, valuable space is wasted on such statements as * : 



COVENTRY : 1891.                                          MALES. 

Shipwright : Ship, Barge, &c., Builder (Wood)                1 
     „          „     „     „     „    (Iron)                0 



while all the males — masters, traders, skilled workmen, labourers, 
errand-boys — engaged in the cycle trade in Coventry are in- 
cluded in — 




In such cases, two useful rules might be applied : omit all 
numbers under, say, 500 when by so doing a line of print 
would be saved ; and give all numbers over 10,000 correctly only 
to the nearest 100, and so for other digits in proportion, thereby 
reducing the width of columns of print. If, for example, we 
knew to the nearest 100 the exact numbers in each district 
and occupation in which as many as 1,000 were 
employed, our knowledge would be as complete 
as we needed ; and it is doubtful whether the space 
occupied by this tabulation would be more than that already 
devoted to the subject. In many cases, on the other hand, it 
is essential to have the raw material quite unchanged. Each 
tabulation must be judged on its own merits. 

* Census Report. 
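
The two rules can be sketched as a single function. The reading of "other digits in proportion" as rounding to three significant figures (nearest 100 for five digits, nearest 1,000 for six, and so on) is an assumption of this sketch, as are the function name and the choice to leave numbers from 500 to 10,000 unaltered.

```python
def abridge(n, omit_below=500):
    """Apply the two rules to one non-negative whole-number entry.

    Returns None for an entry to be omitted. In practice the omission
    applies only where it saves a line of print; that judgment is left
    to the compiler and is not modelled here.
    """
    if n < omit_below:
        return None
    if n <= 10_000:
        return n
    # Assumed reading: keep three significant figures.
    unit = 10 ** (len(str(n)) - 3)  # 100 for five digits, 1,000 for six
    return ((n + unit // 2) // unit) * unit


for entry in (1, 854, 9988, 35009, 1573074):
    print(entry, abridge(entry))
```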

It may be useful to take a particular group of answers, and 
discuss what tabulations will throw most light on the questions 
at issue. The Poor Law Commissioners of 1833 
collected information from a thousand villages in 
England and Wales on the following six points 
among others : the wages of an agricultural labourer in summer 
and in winter, both with and without the inclusion of beer as 
part payment, his annual earnings, and the subsidiary earnings 
of his wife and children. It may be supposed that the chief 
object of the Commissioners was to find whether the labourers' 
families earned enough for their support, and what proportion 
was earned by the wives and children. 

The following scheme of tabulation would show in what 
counties the labourer was badly off: — 



County.            Average Annual Earnings of 

                   Man.          Family.          Together. 

...                ...           ...              ... 







The counties might be taken in alphabetical order for con- 
venience of reference, or in geographical order with subordinate 
averages for groups (e.g., Eastern : Norfolk, Suffolk, Essex) ; or 
the counties might be arranged in the order of the total earn- 
ings, so that it could be seen at a glance in which counties the 
labourers were worst off. 

To show the number of villages, county by county, in which 
the earnings were below a certain minimum, or within certain 
limits, the following table might be used : — 
Annual Earnings of Men and Families. 

[Table. For each county, and for the Eastern Counties together, one line 
gives the number of villages in which the total earnings averaged within 
each of seven successive limits, followed by the average earnings of man, 
of family, and together ; a second line gives the corresponding percentages 
of the total number of villages and of the total earnings. The limits at 
the head of the columns are not legible. The line for Suffolk reads : 

In Suffolk                   0    3    4    5    3    2      2        £28   £11   £39 
Percentages of Total 
Number of Villages           0   16   21   26   16   10½    10½        72    28   ... 

Similar lines follow for Essex and for the Eastern Counties.] 



This table can be used in the above complex form or simpli- 
fied. The number of subdivisions of money to be distinguished 
depends on the space at disposal and on the number of villages 
which would be entered in each. A table in which most of 
the entries are 1 or 0 is open to criticism. In the above table 
the villages are too few to allow accuracy in percentage. 

It will be seen that this table would furnish the answer to 
almost all questions which could be put as to total earnings. 
For instance, if we wish to see the relation between 
total earnings and the family's subsidiary contribution, 
we should look at the smallest totals in the last 
column and see if they corresponded with the largest percentage 
of family earnings. If we found signs of correspondence we 
should re-arrange the counties in the order of these subsidiary 
percentages, and see if they were approximately in order of 
total earnings also. This is an example of tabulation to show 
correlation, the correspondence in the occurrence of two sets of 
phenomena. 
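
The re-arrangement described can be sketched as follows. Suffolk's figures are those of the specimen table ; the other three counties and their figures are invented for the illustration, and the function names are likewise assumptions.

```python
# (county, man's earnings, family's earnings) in pounds a year.
# Suffolk's figures come from the specimen table; the rest are invented.
returns = [
    ("Suffolk", 28, 11),
    ("County A", 32, 8),
    ("County B", 38, 6),
    ("County C", 25, 12),
]


def family_share(row):
    """Percentage of the total earnings contributed by the family."""
    _, man, family = row
    return 100 * family / (man + family)


def total(row):
    """Total annual earnings of man and family together."""
    return row[1] + row[2]


# Re-arrange the counties in order of the subsidiary percentage, largest first.
by_share = sorted(returns, key=family_share, reverse=True)
for row in by_share:
    print(f"{row[0]:10s} share {family_share(row):5.1f}%  total {total(row)}")

# If the largest shares go with the smallest totals, the totals read in
# this order will be found in ascending order.
totals_in_share_order = [total(row) for row in by_share]
print(totals_in_share_order == sorted(totals_in_share_order))
```

In the invented figures the correspondence is perfect ; in real returns it would be approximate, and the eye must judge how nearly the two orders agree.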

Another important group of questions arising in connection 
with these tables is : What is the relation between weekly wages 
and annual earnings, and what proportion of the 
wage is generally paid in kind ? We shall not 
now require the statements as to subsidiary family earnings. In 
records of agricultural wages the most common statement is, e.g., 
"wages in this district are from 10s. to 12s. a week." Now, a 
farm labourer does not generally earn as much in winter as in 
summer, because wages are reduced to correspond to the smaller 
amount of work necessitated by failing light ; from this cause 
annual earnings will be less than the weekly wage multiplied by 52. 
Besides this wage he generally receives special money at hay and 
wheat harvests, and also many payments in kind, such as daily beer, 
house and ground at reduced rent, and other privileges. It is 
generally best to value all these, and compute his earnings thus : 

                                      £    s.   d. 
10s. for 38 weeks                    19    0    0 
12s. for 9 weeks (summer)             5    8    0 
Hay harvest, 1 week                   0   15    0 
Wheat harvest, 4 weeks                5    0    0 
Beer, 1s. per week                    2   12    0 
Cottage and ground                    5    0    0 
Other perquisites                     1    5    0 

                                    £39    0    0 = 15s. per week. 
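
The computation is most safely made in shillings. The sketch below repeats it ; the sum of 15s. for the hay harvest is an inference, being the amount required to bring the items to the stated total of £39, and the variable names are chosen for the illustration.

```python
SHILLINGS_PER_POUND = 20
WEEKS_IN_YEAR = 52

# Each item valued in shillings for the whole year.
items = {
    "10s. for 38 weeks": 10 * 38,
    "12s. for 9 weeks (summer)": 12 * 9,
    "Hay harvest, 1 week": 15,  # inferred from the total
    "Wheat harvest, 4 weeks": 5 * SHILLINGS_PER_POUND,
    "Beer, 1s. per week": 1 * WEEKS_IN_YEAR,
    "Cottage and ground": 5 * SHILLINGS_PER_POUND,
    "Other perquisites": 1 * SHILLINGS_PER_POUND + 5,
}

total_shillings = sum(items.values())
pounds, shillings = divmod(total_shillings, SHILLINGS_PER_POUND)
weekly_earnings = total_shillings / WEEKS_IN_YEAR

print(pounds, shillings)  # 39 0
print(weekly_earnings)    # 15.0 shillings a week
print(100 * (weekly_earnings - 10) / 10)  # excess over a 10s. wage, per cent
```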

In this case earnings are 50 per cent, above the general 
weekly wage. An estimate of this nature has been made by the 
late Mr Little for each county for 1867-70 and 1892. We can 
tabulate the figures for 1833 in the same way for comparison, 
in geographical order and with the county as unit. We must 
first consider the question, Has beer been at all 
generally replaced by money ? We can tabulate 
the figures as follows to answer this : 



                                   1833.                                       1893. 

1.         2.           3.           4.            5.            6.           7.            8. 
County.    Average      Average      Difference.   Number of     Proportion   Difference    Excess of 
           Summer and   Earnings                   Villages      to Total.    between       4 over 7. 
           Winter       per Week.                  where Beer                 Wage and 
           Weekly                                  is given.                  Earnings. 
           Wages. 






















In column 2 should be given the county average of the 
wages stated, without making any cash allowance when beer is 
given. Then if money has been replacing beer, we should find 
that in those counties where beer was most often given, wages 
had risen relatively to earnings more rapidly than in the 
counties where free beer was rare. Columns 4 and 7 show the 
differences for the two dates. When the entry in column 4 is 
greater than the corresponding entry in column 7, kind has 
been replaced by money. These excesses would be given in 
column 8. If money has replaced beer, the counties which 
have the greatest entries in column 6 should also figure high in 
column 8, and vice versa. 

The question, Are winter wages generally below summer 
wages, and by how much ? can be answered by the 
following scheme of tabulation, which uses the data 
not employed in the previous tables : 



Counties.                  Average Weekly       Number of Villages where the Excess of 
                           Wage in              Summer Wages over Winter was 

                           Summer.   Winter.    Nothing.   6d.    1s.    1s. 6d.   2s.    More 
                                                                                          than 2s. 
                           s. d.     s. d. 

Norfolk                    11 2      10 3       13         2      3      2         5      3 
  Percentage of Number 
  of Villages included                          46         7      11     7         18     11 

Suffolk                    10 2      9 8        24         ...    6      1         2      1 
  Percentage of Number 
  of Villages included                          70         ...    18     3         6      3 

Essex                      10 9      9 10       22         ...    11     ...       5      4 
  Percentage of Number 
  of Villages included                          52         ...    26     ...       12     10 

Eastern Counties           10 6      9 11       59         2      20     3         12     8 
  Percentage of Number 
  of Villages included                          57         2      19     3         11     8 
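
Each line of percentages is obtained by dividing the number of villages in a column by the number of villages in the county. A minimal sketch for the Norfolk line follows ; the function name is an assumption, and since each percentage is rounded independently to the nearest unit, a line so computed need not add to exactly 100.

```python
def percentages(counts):
    """Percentage of the whole falling in each class, to the nearest unit."""
    whole = sum(counts)
    return [round(100 * c / whole) for c in counts]


# Norfolk: villages where the excess of summer over winter wages was
# nothing, 6d., 1s., 1s. 6d., 2s., and more than 2s.
norfolk = [13, 2, 3, 2, 5, 3]

print(sum(norfolk))          # 28 villages in all
print(percentages(norfolk))  # [46, 7, 11, 7, 18, 11]
```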






These examples do not quite exhaust the useful tabulations 
of these groups of figures, for we have not yet examined the 
distribution of wages, that is the relative numbers paid at 
different rates. These returns do not, however, illustrate such a 
tabulation well, for we are not told the rates paid to individuals, 
but only the rate prevalent in the villages. 

Group B. — The grouping according to wages affords an 
example of the second method of tabulation. We have now 
no definite questions to answer, as in the method so far discussed, 
but a more general problem : given a mass of data, it is re- 
quired to tabulate it, so as to present the maximum amount of 
useful information. Our raw material is so many thousand 
isolated statements, which must be focussed, made to present 
definite meaning, and worked up so as to be useful for future 
comparison. 

Some investigations are undertaken not to answer any definite 
questions or to throw light on any given problem, but to 
collect information which, though it has no immediate 
use, is likely to be needed ultimately by many 
investigators occupied with various questions. Such 
is a wage census. So long as we have no sufficient account of 
wages, we are badly informed as to one of the most important 
measurements of the social body, and economists and statisticians 
are continually hindered by the want of data essential for their 
work ; but the census has no immediate practical use, for knowing 
the height of wages does not help us directly to regulate that 
height. In such an investigation our object will be to examine 
the figures, and give all the groupings and averages which seem 
likely to be useful for any purpose ; and while doing this we 
shall imperceptibly pass to a different class of investigation ; 
we shall be finding a structure underlying our multifarious 
details ; we shall find that the chaos, which our figures present 
at first sight, obeys laws ; we shall be making a visible outline, 
and giving a definite shape to our apparently featureless mass. 

The complete discussion of this problem belongs to a later 
chapter ; but the tabulation can be begun without special 
technique. The examples taken will relate chiefly to wages, 
but the methods are quite general. 

[Selection of limits of groups.]
In the American Report on Wholesale Prices, Wages and
Transportation of 1891, the wages of some 10,000 persons are
detailed. It is proposed to consider their tabulation as a
homogeneous group. The results are given on pp. 91-2.
In the original publication the wages are given
to half a cent ; in the second column, on p. 91, the numbers of
wage-earners are given in 10-cent groups, from $.25 to $.34, $.35
to $.44, and so on, those earning wages exactly at the dividing
points being always placed in the division below. Notice that
the average wage of such a group as $2.15 to $2.24 is not $2.20
if the wage-earners are evenly distributed cent by cent, but the
average of $2.15, $2.16, . . . $2.24, i.e., $2.195.
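
The arithmetic of the last sentence can be checked in a few lines. This is only a sketch: it assumes exactly one earner at each cent from $2.15 to $2.24, which is the even distribution supposed in the text, and it works in cents to avoid rounding trouble.

```python
# One earner at each cent from $2.15 to $2.24 (the even distribution
# supposed in the text), held as whole cents.
cents = list(range(215, 225))

# Arithmetic mean of the ten wages, still in cents.
mean_cents = sum(cents) / len(cents)

print(f"average wage: ${mean_cents / 100:.3f}")
```

The mean falls half a cent below the nominal middle of the group because the group runs from $2.15 to $2.24, not to $2.25.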

Looking at column 2, it will be seen that the figures present 
no order, follow no rule ; no structure has yet been found, our 
divisions are too narrow for our material. 

Now group the wage-earners with wider limits, as in column 6, 
where the numbers earning in half-dollar groups are given ; we 
have here a nearly regular sequence of numbers falling after the 
maximum in the second group. Going back to narrower limits, 
to find exactly at what divisions this regularity is first in evidence, 
we have in column 4 the numbers in 20-cent groups which show 
considerable, but not absolute regularity. The numbers in 
30-cent groups* are successively 75, 355, 674, 1,242, 740, 660,
343, 310, 180, 181, 233, 32, 82, 3, 4, 8, 1, almost completely
regular except for the large group at $3.50.

The question as to which of these groupings should be selected 
is to be decided by the number of separate items the eye can 
instantaneously grasp. In looking at the 25 numbers in the 
20-cent groups, or the 18 in the 30-cent, the meaning is lost in 
a maze of figures (though as many details as these could be 
properly shown in a diagram), but the 11 numbers in the
half-dollar groups are easily comprehended.

Stated in words, the result of our tabulation (column 7) is
that 6 per cent. of the wage-earners made from $.25 to $.74,
28 per cent. from $.75 to $1.24, and so on.
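
The regrouping described above can be sketched as follows. The counts are the 10-cent figures of column 2 on p. 91; two points are assumptions made for this illustration only: the three lowest groups are entered together as 75 (the first 30-cent total quoted in the text), and the $.55 group is taken as 85 (the second 30-cent total, 355, less 157 and 113). The function name `regroup` is hypothetical.

```python
# 10-cent counts from the $.55 group upward, in order (column 2, p. 91).
# Assumption: 85 for the $.55 group is derived as 355 - 157 - 113.
COUNTS_FROM_55 = [
    85, 157, 113, 169, 201, 304, 685, 99, 458, 466, 72, 202,
    329, 58, 273, 45, 265, 33, 101, 196, 13, 163, 2, 15,
    129, 5, 47, 12, 0, 221, 5, 16, 11, 0, 82, 0,
    3, 0, 0, 4, 0, 0, 0, 0, 8, 0, 0, 1,
]

# Lower limit of each group in cents -> number of persons.
# Assumption: the three lowest groups are merged into one entry of 75.
ten_cent = {25: 75}
ten_cent.update({55 + 10 * i: n for i, n in enumerate(COUNTS_FROM_55)})


def regroup(counts, start, width, stop):
    """Add the narrow groups into wider ones of `width` cents,
    the first wide group beginning at `start`; `stop` is exclusive."""
    return [
        sum(n for low, n in counts.items() if lo <= low < lo + width)
        for lo in range(start, stop, width)
    ]


thirty_cent = regroup(ten_cent, 25, 30, 535)
half_dollar = regroup(ten_cent, 25, 50, 535)
total = sum(ten_cent.values())
percentages = [round(100 * n / total, 1) for n in half_dollar]

print(thirty_cent)
print(half_dollar)
print(percentages)
```

Under these assumptions the wider groupings agree with the 30-cent series quoted in the text and with columns 6 and 7 of the table.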

[Practical tabulation.]
For the practical work of the tabulation from the original
figures, we should take ruled sheets, enter at the head of successive
columns certain wage limits, and turning through the items
enter each wage by a dot in its appropriate column, grouping
them in fives and tens, to facilitate addition.

From the preceding paragraphs it is clear that we do not 
need to take separate columns for each cent from $.25 to $5.35 
for tabulation, but a little consideration is necessary to see how 
minute the limits should be to give the correct average. 



* Vide p. 121 infra.
Suppose the entries in cent groups to be:

     $1.70     $1.71     $1.72     $1.73     $1.74
     [46 entries in all, each marked by a dot under its wage]

The average of the wages so entered can be quickly calculated
as $1.718.

If, on the other hand, we put all the 46 entries as simply be- 
tween $1.70 and $1.74, or more exactly as much as $1.70 but less 
than $1.75, we should naturally take them to be all (for purposes 
of averaging) at the middle point of this group, viz., $1.72. 

If we have a sufficient number of items, the differences
between the average assumed and that calculated for each group
will be very slight. This is seen on p. 91 ; column 8 gives the
averages calculated from the entries in 10-cent groups, while
column 9 gives them on the hypothesis that for purposes of
averaging the numbers in the half-dollar groups are all at the
middle points of their groups. The difference is greatest in the
first and last, the smallest groups. The general average obtained
from column 9 is $1.70, which is the nearest round number to the
true average $1.73. Hence, for the purpose of obtaining the
general grouping and average, we need only take 11 half-dollar
columns for marking in our items.
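
The comparison of columns 8 and 9 can be reproduced as a weighted average. The sketch below takes the numbers of persons from column 6, the calculated group averages from column 8 and the assumed middle points from column 9 of the table on p. 91; the helper name `weighted_mean` is introduced here for illustration only.

```python
# Half-dollar groups on p. 91: persons (col. 6), calculated group
# averages (col. 8) and assumed middle points (col. 9), in dollars.
persons    = [317, 1472, 1297, 970, 506, 198, 254, 96, 4, 8, 1]
calculated = [0.62, 1.09, 1.49, 1.99, 2.53, 3.04, 3.51, 4.00, 4.50, 5.00, 5.35]
assumed    = [0.50, 1.00, 1.50, 2.00, 2.50, 3.00, 3.50, 4.00, 4.50, 5.00, 5.25]


def weighted_mean(values, weights):
    """Average of `values`, each weighted by the number of persons."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


from_middle_points = weighted_mean(assumed, persons)
from_group_averages = weighted_mean(calculated, persons)

print(f"general average from middle points:  ${from_middle_points:.3f}")
print(f"general average from group averages: ${from_group_averages:.3f}")
```

The two results differ by only a few cents, which is the point of the paragraph: eleven half-dollar columns suffice for the general average. The second figure differs slightly from the printed $1.731 because the group averages in column 8 are themselves rounded to the cent.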

For other purposes it may be advisable to work more minutely ; 
for in the lowest group, we shall wish to know how many are 
earning $.25, $.30, $.35 separately, for 5 cents is a perceptible 
difference on 25 cents. At the top also it may be useful to know 
the exact wages. 

[The Galtonian method.]
More minute entries again will be needed for the second
method of tabulation, which is as follows: Suppose all the
wage-earners to be arranged in order of the magnitude of their
wages, those at $.25 at one end, those at $5.35 at the other.
Note the wages of men at given points in the row. The lowest
wage is $.25 ; one-tenth of the way along, that of the 512th
worker is between $.85 and $.95, . . . ; half-way up the wage is
$1.50. The figures at each tenth are given on p. 92. By this
means we get a very vivid idea of the distribution according to
wages.

These numbers cannot be obtained accurately if we have only
entered the details correct to half-dollars, but can be found from
the 10-cent grouping, which is therefore the classification to
be adopted. We must first determine in which of the small
groups the men one-tenth, two-tenths . . . up the group lie,
and then estimate their position inside the smaller group.
Thus, if we want the figure more accurately than "between
$.85 and $.95," as given above, we proceed as follows: The 512th
man from the bottom is the 82nd man in the group between
$.85 and $.95, for there are 430 earning less than $.85 ; this group
contains 169 ; if they were distributed regularly, 17 to each
cent, the 82nd man would be half-way through this group,
between $.89 and $.90. The hypothesis of even distribution is
sufficiently correct for most purposes, and this method affords
a sufficiently accurate means of determining the wage of the
workers at the tenth places. The resulting figures are given on
p. 92. If, however, we want to know the wage of the half-way
man more exactly, we see from the half-dollar groups that it is
between $1.25 and $1.75 ; a rough approximation shows it to lie
probably between $1.45 and $1.55, and then we rapidly turn
through our original data, isolating the wages at $1.46, $1.47,
. . . $1.55.*
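
The procedure just described (find the small group containing the man wanted, then place him within it on the hypothesis of even distribution) can be sketched as below. The counts are the 10-cent figures of p. 91; as an assumption for this illustration the three lowest groups are merged into one of 75 persons and the $.55 group is taken as 85, both derived from the 30-cent totals quoted earlier. The function name `wage_at` is hypothetical.

```python
# 10-cent counts from the $.55 group upward (column 2, p. 91).
# Assumption: 85 for the $.55 group is derived as 355 - 157 - 113.
COUNTS_FROM_55 = [
    85, 157, 113, 169, 201, 304, 685, 99, 458, 466, 72, 202,
    329, 58, 273, 45, 265, 33, 101, 196, 13, 163, 2, 15,
    129, 5, 47, 12, 0, 221, 5, 16, 11, 0, 82, 0,
    3, 0, 0, 4, 0, 0, 0, 0, 8, 0, 0, 1,
]

# (lower limit, upper limit, persons), limits in cents.
# Assumption: the three lowest groups are merged into one of 75 persons.
groups = [(25, 55, 75)]
groups += [(55 + 10 * i, 65 + 10 * i, n) for i, n in enumerate(COUNTS_FROM_55)]


def wage_at(rank, groups):
    """Wage (in cents) of the man standing `rank` from the bottom,
    supposing each group to be spread evenly between its limits."""
    below = 0
    for low, high, n in groups:
        if n and below + n >= rank:
            return low + (rank - below) / n * (high - low)
        below += n
    raise ValueError("rank exceeds the number of persons")


total = sum(n for _, _, n in groups)
deciles = [wage_at(total * k / 10, groups) for k in range(1, 10)]

for k, w in enumerate(deciles, start=1):
    print(f"{k}/10 up the group: ${w / 100:.2f}")
```

The first figure falls between $.89 and $.90, as in the worked example of the text; the remaining figures may be compared with the table of "tenth" men on p. 92.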

A slight modification of this method is also useful. Take the
average of the lowest 512 (or tenth), namely, $.70 ; of the next,
namely, $1.03 ; and so on (see p. 92). These figures also give a
vivid view, and are very convenient for comparisons with other
groups.

The figures so far apply to only half of the data in the 
Senate Report. On p. 92 the whole are tabulated to give the 
average wages of the successive tenths. A comparison of the 
two groups so obtained shows how far the first half was typical 
of the whole. This method will be dealt with in a later chapter. 

* On this method see pages 127, 128.
Tabulation of Wages: American Figures, 1891.

Columns 1 and 2. Ten-cent groups.

   Earning Daily Wages ($)         No. of
   as much as   and less than      Persons.
      .25           .35               ..
      .35           .45               ..
      .45           .55               ..
      .55           .65               ..
      .65           .75              157
      .75           .85              113
      .85           .95              169
      .95          1.05              201
     1.05          1.15              304
     1.15          1.25              685
     1.25          1.35               99
     1.35          1.45              458
     1.45          1.55              466
     1.55          1.65               72
     1.65          1.75              202
     1.75          1.85              329
     1.85          1.95               58
     1.95          2.05              273
     2.05          2.15               45
     2.15          2.25              265
     2.25          2.35               33
     2.35          2.45              101
     2.45          2.55              196
     2.55          2.65               13
     2.65          2.75              163
     2.75          2.85                2
     2.85          2.95               15
     2.95          3.05              129
     3.05          3.15                5
     3.15          3.25               47
     3.25          3.35               12
     3.35          3.45                0
     3.45          3.55              221
     3.55          3.65                5
     3.65          3.75               16
     3.75          3.85               11
     3.85          3.95                0
     3.95          4.05               82
     4.05          4.15                0
     4.15          4.25                3
     4.25          4.35                0
     4.35          4.45                0
     4.45          4.55                4
     4.55          4.65                0
     4.65          4.75                0
     4.75          4.85                0
     4.85          4.95                0
     4.95          5.05                8
     5.05          5.15                0
     5.15          5.25                0
     5.25          5.35                1
     Totals                        5,123
     Average Wage                 $1.731

Columns 3 and 4. Twenty-cent groups.

   Earning Daily Wages ($)         No. of
   as much as   and less than      Persons.
      .2            .4                16
      .4            .6               144
      .6            .8               270
      .8           1.0               370
     1.0           1.2               989
     1.2           1.4               557
     1.4           1.6               538
     1.6           1.8               531
     1.8           2.0               331
     2.0           2.2               310
     2.2           2.4               134
     2.4           2.6               209
     2.6           2.8               165
     2.8           3.0               144
     3.0           3.2                52
     3.2           3.4                12
     3.4           3.6               226
     3.6           3.8                27
     3.8           4.0                82
     4.0           4.2                 3
     4.2           4.4                 0
     4.4           4.6                 4
     4.6           4.8                 0
     4.8           5.0                 8
     5.0           5.2                 0
     5.2           5.3                 1
     Total                         5,123

Columns 5 to 9. Half-dollar groups.

   5.                           6.         7.          8.            9.
   Earning Daily Wages ($)      No. of     Percent-    Average       instead
   as much as   and less than   Persons.   age.        Wage in       of
                                                       Group. $      $
      .25           .75            317       6.2          .62          .50
      .75          1.25          1,472      28.7         1.09         1.00
     1.25          1.75          1,297      25.3         1.49         1.50
     1.75          2.25            970      18.9         1.99         2.00
     2.25          2.75            506       9.9         2.53         2.50
     2.75          3.25            198       3.9         3.04         3.00
     3.25          3.75            254       5.0         3.51         3.50
     3.75          4.25             96       1.9         4.00         4.00
     4.25          4.75              4        ..         4.50         4.50
     4.75          5.25              8        .2         5.00         5.00
     At 5.35                         1        ..         5.35         5.25
     Total                       5,123     100
Wages of "Tenth" Men (deciles).

   Lowest Wage            $.30
   1/10th up Group         .89
   2/10ths  „             1.12
   3/10ths  „             1.22
   4/10ths  „             1.39
   5/10ths  „             1.49
   6/10ths  „             1.75
   7/10ths  „             1.99
   8/10ths  „             2.36
   9/10ths  „             2.98
   Highest                5.35

   Average Wage of                    Same for 10,000
                                      Workers.
   Lowest tenth           $.70             .79
   Second  „              1.03            1.00
   Third   „              1.18            1.24
   Fourth  „              1.28            1.50
   Fifth   „              1.44            1.50
   Sixth   „              1.58             ..
   Seventh „              1.86            2.00
   Eighth  „              2.14            2.22
   Ninth   „              2.59            2.58
   Highest „              3.51            3.55
   General Average        1.731           1.82



[Tabulation in the wage census.]
The tabulation of the data collected for the WAGE CENSUS
on such forms as that on p. 36 illustrates well some of the
difficulties involved. The items given on the main part of the
schedule are of this kind:

                          No.     Average Wage.
   Spinners (Time)         6          12s.          56½ hours.

Such returns are not perfectly definite, for if many are
employed in the same occupation in a mill, it is possible that
they will earn at different rates. Thus this entry
of 6 at 12s. might arise from either 6 men each
earning 12s. ; or 2 at 10s., 2 at 12s., 2 at 14s. (average 12s.) ;
or 4 at 12s., 1 at 15s., 1 at 11s. ; or 5 at 12s. and 1 at 18s. ; 12s.
being the general rate, but not the average, in these last two
alternatives. Since the purpose of the wage census was to
give a comprehensive account of wages adapted for use in all
investigations, it should show the numbers in all trades and
subdivisions of employment by age, sex, and district, the average
and general rate of pay for each group, and sufficient details to
show the distribution about the average in each group, for a
mere average may conceal exceptionally high or exceptionally
low wages.
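
The ambiguity of an entry such as "6 at 12s." can be shown by working out the four alternatives just named. This is a sketch only; the third alternative is read here as four at 12s., one at 15s. and one at 11s., and the labels are introduced for the illustration.

```python
from statistics import mean

# Weekly wages in shillings for six spinners, under the four
# alternatives named in the text.
alternatives = {
    "six at 12s.":                      [12, 12, 12, 12, 12, 12],
    "two each at 10s., 12s., 14s.":     [10, 10, 12, 12, 14, 14],
    "four at 12s., one 15s., one 11s.": [12, 12, 12, 12, 15, 11],
    "five at 12s., one at 18s.":        [12, 12, 12, 12, 12, 18],
}

# Average, lowest and highest wage for each alternative.
summary = {
    label: (mean(wages), min(wages), max(wages))
    for label, wages in alternatives.items()
}

for label, (avg, low, high) in summary.items():
    print(f"{label:34s} average {avg:5.2f}s.  range {low}s. to {high}s.")
```

The first two alternatives share the average of 12s. but differ widely in range; in the last two 12s. is the prevailing rate without being the average. A single line of the schedule cannot distinguish between them.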

On inquiry at the Labour Department as to whether the
original information had been given in a more detailed form than
the line above, or whether divergencies might be concealed, the
author learnt that the subdivision of occupations had been carried
to such an extent, that in practice, where there was any great
variation in the wages of workers under one heading, that heading
had been split up, so that each group was separately entered,
or that several groups were distinguished under one heading ; and
that when there was reason to believe from the light of other
returns that this had not been done, supplementary inquiries
were made on this point, so that the original data were detailed
enough for any requisite fineness of tabulation.

The problem then was to tabulate the answers from the
various factories in a district, to show clearly and succinctly
the distribution of wages in each subdivision and in the whole.
It can hardly be said with confidence that the method adopted, of
which a specimen is given on p. 94, is entirely satisfactory.

To clear our ideas let us suppose that the details on which
the line relating to throwsters (time) was based were as
follows:

   "average minimum rate."

     3 earning 14/
    14    „    15/
     6    „    15/6
    20    „    16/
    10    „    17/6
    20    „    18/
     8    „    18/6
    10    „    19/
    10    „    20/6
     8    „    ..

   68 within 10 per cent. of the average for all, which is 17/7.

   8 earning 20/11 on the average.



[Various methods possible.]
The process adopted in the tabulation may be supposed to
have been to separate from the whole group of returns a small
group of old men or inferior workers earning far
below the average, and enter them as a distinct
minimum group, and to separate a small group of the most
skilled workers and enter them as a maximum group. This
is better than giving simply the highest and lowest of the
individual wages, for either of these may be due to exceptional
circumstances, and may be quite a long way from that
paid to any other person. The exact size of these extreme
groups must be determined from inspection of the returns
themselves. After this has been done, the remaining wages may not
be grouped close together ; in the example taken they are
scattered between 15s. and 19s. To give some clue as to this
distribution the number earning within 10 per cent. of the
average is stated ; this is probably the best way if only one
column can be devoted to it, but 10 per cent. is a wide limit
to adopt. Another method would be to give the limits within
which the wages of the 10 per cent. of the earners above and
10 per cent. below the average were contained : in this case 16s.
and 18s.

If, however, not more than 8 columns are to be devoted to 
each group, the following arrangement would give much more 
definite information, and it could have been made from the data 
in hand, and would be well adapted for all the purposes for 
which it would be required. 

   Number employed                  109
   General average                  17/7
   Average of lowest tenth *        14/9
   Quartile †                       16/
   Median †                         18/
   Quartile †                       19/
   Average of highest tenth *       21/2

[The general summary.]
We are fortunately not dependent solely on the tabulation
as given above, for wages in industries as a whole
are also tabulated on the following plan, which is
in a form most useful for purposes of comparison (p. 96).

The lines giving percentages are most useful. We can at a 
glance compare the levels of wages in different industries. Thus 
in the cotton manufacture the average wage is 2s. higher than in 
the woollen ; and in the cotton there is a large group of highly 
skilled workers earning from 30s. to 35s., while in the woollen 
nearly half are close to the average, earning between 20s. and 
25s. In the jute and linen manufactures the averages are nearly 
the same, but in the former a larger proportion are below the 
15s. limit. In the silk manufacture there is an aristocracy as in 
the cotton, but it is smaller and better paid, for 12 per cent.
earn more than 35s. This table is a masterpiece of concentration
and clearness. 

* Vide p. 92. † Vide p. 124.

[Tabulation of change of wages returns.]
We will discuss next the tabulation of the figures relating
to CHANGES in RATES of WAGES collected by the Labour
Department. Specimens of the forms by which
such information is obtained were given among
those relating to strikes (p. 57). Referring to
them, it will be seen that the facts given are the occupations
and numbers affected, the dates from which the changes took
place, and the wages and hours in a full week exclusive of
overtime (a definition corresponding exactly to that used for the
wage census) before and after the change.



Extract from Table showing the Changes in Rates of Wages and
Hours of Labour of Ordinary Agricultural Labourers in Various
Districts of the United Kingdom in 1894, so far as reported to the
Board of Trade.*

                       Particulars of Changes in        Particulars of Changes in        No. of Male Agricultural
                       Summer Wages (1894 compared      Winter Wages (1894 compared      Labourers, Farm Servants,
                       with 1893). Per Week.            with 1893). Per Week.            Shepherds, Horsekeepers,
                                                                                         Horsemen, Teamsters,
County and Union.      Increase.   Decrease.            Increase.         Decrease.      Carters, in '91.

Lincolnshire:
  Gainsborough           ...       ...                  ...               1/6 (15/ to 13/6)     2,466
  Louth                  ...       ...                  ...               1/6 (13/6 to 12/)     3,932
  Spilsby                ...       ...                  ...               1/6 (13/6 to 12/)     3,288
Norfolk:
  Aylsham                ...       1/ (12/ to 11/)      1/ (10/ to 11/)   ...                   2,576
  Docking                ...       6d. (12/6 to 12/)    ...               ...                   2,487
  Flegg, East and West   ...       1/ (12/ to 11/)      ...               1/ (11/ to 10/)       1,108
  Forehoe                ...       ...                  ...               1/ (11/ to 10/)       1,448

* From the second Annual Report on Changes of Wages, pp. 198-9 ; a
little compressed.
Extracts from Table showing the Changes in Rates of Wages of
Ordinary Agricultural Labourers in Various Districts of the United
Kingdom in the Summer of 1895, so far as reported to the Board
of Trade.*

                                 No. of Male Agricultural     Particulars of Changes        Weekly Rate of Wages
                                 Labourers, Farm Servants,    in Summer Wages (1895         in Summer.
                                 Shepherds, Horsekeepers,     compared with 1894).
                                 Horsemen, Teamsters,         Decreases in italics.         1894.        1895.
County and Union.                Carters, in 1891.            Per Week.                     s.  d.       s.  d.

Durham:
  Stockton*                            437                    Decrease of 6d.               17  6        17  0
  Teesdale (Barnard Castle
    Rural Dist.)*                      669†                   Advance of 6d.                17  6        18  0
Oxfordshire:
  Headington                         1,118                    Decrease of 1s.               12  0        11  0
  Henley (Hambleden Rural
    Dist., Bucks)                    1,587†                   Decrease of 1s.               12 to 14     11 to 13
Norfolk:
  Flegg, East & West                 1,108‡                   Decrease of 1s.               11  0        10  0
  Forehoe                            1,448                    Decrease of 1s.               11  0        10  0
  Henstead                           1,504                    Decrease of 1s.               11  0        10  0
  Mitford and Launditch              3,622                    Decrease of 1s.               11  0        10  0
  Smallburgh                         2,264‡                   Decrease of 1s.               11  0        10  0
  Swaffham                           1,942                    Decrease of 1s.               11  0        10  0
  Wayland                            1,535                    Decrease of 1s.               11  0        10  0
Carnarvonshire:
  Carnarvon (Gwyrfai Rural           1,124†                   Labourers without food,
    Dist.)                                                    advance of 1s.                19  0        20  0
                                                              Labourers with food,
                                                              advance of 1s.                11  0        12  0

* Agricultural labourers in this district are hired in March and April for a year
certain, and the change noted applies to the whole year, and not to the summer only.

† The number of agricultural labourers, &c., is for the Poor Law Union, but the
change applies to the Rural District only.

‡ This number is partly estimated.

* From the third Annual Report on Changes of Wages, pp. 118, 119, 121
(typography adapted).
[Agricultural wages: change in tabulation.]
The adjoining tables give examples of the way in which the
changes in agricultural wages were tabulated in the Second and
Third Reports on Changes in Rates of Wages and
Hours of Labour. In the first table space is
wasted by devoting separate columns to increases
and decreases, with the intention of making the table distinct ;
while it is not clear whether "Winter 1894" means the winter
beginning in or that ending in that year.

In the second table, which refers to summer wages only, the
columns are rearranged, and increases and decreases printed in
the same column, the latter in italics. In the Fifth Report all
the information is printed in a clearer way, thus:







                                  Winter Wages.*

                             Weekly Rates.              Increase or Decrease per
                                                        Week in 1897.
District.      Number.       Jan. '96.    Jan. '97.     Increase.     Decrease.
                             s.  d.       s.  d.        s.  d.
Tendring       3,113         10  0        11  0         1   0         ...

The tabulation is repeated for the summer. 

* From the fifth Annual Report on Changes of Wages, p. 145.

[The number affected.]
The weakness in these agricultural returns is in the numbers
column. In the returns from other industries the numbers given
are those actually affected, but in this case it is not
found possible to obtain this number correctly, and
the number entered is that found under "agricultural labourers"
in the 1891 census, which includes the various categories as given
in the above table. When a change of wages takes place in a
rural district, we may perhaps assume that it is likely to be
general, though if it was a reduction, it might not be made
by the better employers ; and though the change will not
take place in the same week throughout the district, there
is not likely to be much variation in this respect. The
change is generally made at the time that winter wages
give place to summer, or summer to winter ; and a slight
increase or decrease may take place by making the winter
reduction or the summer advance later than usual. On the
whole, little error will be introduced by assuming that the change
stated affects all the adult agricultural labourers in the district,
and it is quite probable that a proportional change* will take
place in the wages of horsekeepers, shepherds, and others,
though it may not in the case of boys, or old men who are
earning less than the district rate. The question, "Approximate
number of able-bodied labourers in parish?" is asked on the
inquiry form, but as the answers are not used, it may be
assumed that they are generally not given with sufficient
exactness.

The object of the whole tabulation is to show the change in
the national weekly wages bill, but many details are lacking for
the complete calculation. In the case of agricultural labourers,
we need, in addition to these data, accurate statements of the
change of additional earnings, special payments,
and payment in kind. In all cases we need a
more complete account of the whole wage-bill as well as the
change. For agricultural labourers the material has just been
published by the Labour Department ;* every year it receives
returns from most of the 600 unions as to wages at all seasons,
whether there has been a change or not.

[Changes in county rates.]
The looseness in the returns as to numbers does not prevent
our calculating the change in the county or country rates, for
the numbers in each district affected by the change
may be expected to bear the same proportion to
the numbers given in the census returns, as the number of
agricultural labourers of the same class in the whole county or
country does to the census number.

* On these points see Mr Wilson Fox's Report on Wages and Earnings
of Agricultural Labourers, 1900, p. 50, and pp. 111-157.

The calculation for Durham in the above table for the
changes in summer wages 1894-95 may be performed as
follows:

                 Average before               Proportional         Amount of change
                 change.          Change.     number affected.     on wage-bill.
                 s.  d.                                            s.  d.
   Stockton      17  6            -6d.        4                    -2  0
   Teesdale      17  6            +6d.        7                    +3  6

   Total change in county, +1s. 6d.
   Proportional number in county, 73.
   Effect on county average, 18d./73 = ¼d.

Here, for simplicity of calculation, the numbers affected are
taken to the nearest 100, a process which is not likely to affect
the average perceptibly.* This rough method is likely to give
the result as accurately as the original data make possible. A
similar process with suitable modifications can be applied to the
changes tabulated for other industries. The summary of such
returns for agriculture for all counties is as follows:

Comparison of the Net Effect of the Changes of Cash Wages
per Week paid in the Years 1896 and 1895 in certain Districts
in England and Wales.†

                                  WAGES IN 1896 AS COMPARED            WAGES IN 1895 AS COMPARED
                                  WITH 1895.                           WITH 1894.

                                  Total**    Net Effect of Changes     Total**    Net Effect of Changes
                                  Number.    on Weekly Wages.          Number.    on Weekly Wages.
                                             Increase (+) and                     Increase (+) and
                                             Decrease (-).                        Decrease (-).
                                             Total.     Per Head.                 Total.     Per Head.
District.                                    £          s.  d.                    £          d.

England:
  Northern Counties                5,662       -43      -0  1¾          3,766        +44     +2¾
  Yorkshire, Lancashire,
    and Cheshire                   2,897      +100      +0  8¼          3,942       -126     -7¾
  Eastern and Midland
    Counties                      69,869      +666      +0  2¼         89,576     -2,045     -5½
  Southern and Western
    Counties                      20,901      -340      -0  4          20,441       -575     -6¾
Wales                                ...       ...       ...            2,165        +73     +8

  Total                           99,329      +383      +0  1         119,890     -2,629     -5¼

** The number given is the total of male agricultural labourers, farm servants, shepherds,
horsekeepers, in 1891, in the Poor Law Unions in which the changes took place.



* The corresponding calculations for Oxfordshire are:

      12/     -1/     11      -11/
      13/     -1/     16      -16/
                              -27/

   Effect on county average, -27s./161 = -2d.

For Norfolk:

      12/     -1/     134     -134/

   Effect on county average, -134s./425 = -4d.

† From the fourth Annual Report on Changes of Wages, p. xliv.
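
The county calculations for Durham (in the text) and Oxfordshire (in the footnote) can be sketched as below. The numbers affected are rounded to the nearest hundred, as in the text, and the proportional numbers for the whole county (73 and 161) are those there quoted. The function name `county_effect` is introduced for illustration.

```python
from fractions import Fraction


def county_effect(changes, county_hundreds):
    """Effect on the county average, in pence per head.

    `changes` lists (change in pence, census number for the district);
    each number is taken to the nearest hundred before weighting.
    `county_hundreds` is the proportional number for the whole county."""
    net_pence = sum(pence * round(number / 100) for pence, number in changes)
    return Fraction(net_pence, county_hundreds)


# Durham: Stockton -6d. on 437, Teesdale +6d. on 669; county number 73.
durham = county_effect([(-6, 437), (+6, 669)], 73)

# Oxfordshire: Headington and Henley, each -1s. (12d.); county number 161.
oxfordshire = county_effect([(-12, 1118), (-12, 1587)], 161)

print(f"Durham:      {float(durham):+.2f}d. per head")
print(f"Oxfordshire: {float(oxfordshire):+.2f}d. per head")
```

Exact fractions are kept until the last step, so that the rounding to the nearest farthing or penny is made once only, on the final quotient.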
[Criticism of summary table.]
The value of this summary table is not obvious. It seems of little
importance to know how many persons were affected altogether ;
though it is of some value to learn from a previous
table that 58,578 persons received increases, and
40,751 decreases in 1896. This total of persons affected is
constantly given in these tables ; if a person receives an increase of 1s.
one month, and loses it the next, he is counted as 2, and his
contribution to the next column (net effect of change) is zero. This
-£43 may mean that 2,000 persons received a decrease of 1s. each,
and the remaining 3,662 (same or different persons) an increase
of 3¾d. each, or any other figures which would give the same
total. The change per head in the next column is unimportant ;
it only shows an arithmetical quotient with no concrete meaning
that can be expressed in words. If it was replaced by another
quotient, viz., -£43/n, where n is the number of agricultural
labourers in the Northern Counties, we should know the effect
on average wages. In fact, the table would be more useful
thus:



Approximate Effect of Changes on National Weekly Wage Bill.

               INCREASES.            DECREASES.
               No.                   No.                   Net        Total No.     Average
District.      affected.   Total.    affected.   Total.    Change.    Employed.     Change.
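
The criticism of the "per head" column can be illustrated with the Northern Counties figures for 1896. The sketch below assumes 240 pence to the pound and reads the composition suggested in the text as 2,000 persons losing 1s. and 3,662 gaining 3¾d.; both the quotient and that composition are worked in pence.

```python
PENCE_PER_POUND = 240

# Northern Counties, 1896 compared with 1895.
number_counted = 5662      # persons counted as affected
net_effect_pounds = -43    # net effect on the weekly wage bill

# The quotient printed in the "per head" column, in pence.
per_head_pence = net_effect_pounds * PENCE_PER_POUND / number_counted

# One composition consistent with the same net total: 2,000 persons
# losing 1s. (12d.) each and 3,662 gaining 3.75d. each.
composition_pence = 2000 * -12 + 3662 * 3.75
composition_pounds = composition_pence / PENCE_PER_POUND

print(f"printed quotient:   {per_head_pence:.2f}d. per person counted")
print(f"composition total:  £{composition_pounds:.1f}")
```

The quotient comes out at rather less than 2d., though on this composition no person actually experienced a change of that amount: every one either lost a shilling or gained 3¾d. Dividing the same net total by the whole number employed would instead give the effect on average wages.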



















The figures given supply an example of the common practice 
of carrying out into detail a calculation which depends originally 
on incorrect numbers, in this case the number employed, and is 
therefore misleading throughout. Till the average (useless here 
in any case) is taken, the error in this quantity has no injurious 
effect. As shown above, the average here given could be replaced 
by another which would be of use, and which would be correct 
within limits that could be defined, and would be narrow enough 
for most purposes. 

Further, since the column of numbers affected is admittedly 
wrong, the figures should be given to the nearest 1,000 rather 
than to units, even if no attempt was made to estimate the new
figure ; " between 5,000 and 6,000 are affected " is a more useful 
and correct statement than "5,662 persons belonged in 1891 to 
a class in some undefined way connected with that in question 
in 1896." 

The discussion of Group C, the tabulation of non-numerical 
answers, must be postponed till we have analysed the nature and 
use of averages. 
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CHAPTER V. 

AVERAGES, 

It is natural) in a book with the present title, to allot a 
considerable space to averages. By the use of averages complex 
groups and large numbers are presented in a few significant 
words or figures ; and thus the two definitions of statistics, 
the Science of Averages and the Science of Large Numbers , are 
reconciled. 

Some writers have attempted to draw a distinction between 
averages and meanSy but no general agreement has been reached 
ATtrmsMaiid ^s to the exact senses in which the words are 
"*••"• to be separately applied.* The best distinction 
may be made by deciding that an average is a purely arithmetical 
conception, such as the average length of life in a varied popu- 
lation, which does not correspond to any particular group, but 
IS only a short way of expressing an arithmetical result ; while 
the word " mean " is to be applied to some objective quantity, 
such as the mean height of Englishmen, about which all height- 
measurements are grouped according to a definite law, which 
will be discussed in the sequeLt 

A. Arithmetic Averages. — We may rapidly pass by 
some of the common uses of the word "average," and pick
out those which will prove of use in statistics. An average is
sometimes used merely to save big figures. The average weight
of the University crew is given, only because it is more usual
to speak of a man's weight being 12½ stone than of eight men's
weight being 12½ cwt., and it is easier to connect the former
with men's weight in general. Similarly, if we are comparing
the value of the exportations of some commodity in two periods
of ten years each, we should say that the yearly average in the
period 1870-79 was £10,000,000, and in 1880-89 was £11,000,000,
rather than that the totals were £100,000,000 and £110,000,000.

* Compare the article "Moyenne," by Dr Bertillon, in Dictionnaire
encyclopédique des Sciences Médicales, with this chapter. See also
the paper by Dr Venn in the Statistical Journal, 1891, and chap.
xviii. in his Logic of Chance.

† See Part II., Section i, infra.

[Marginal note: The common denominator.]
This leads to the second ordinary use of the word. If we
were comparing the ten years 1870-79 with the eleven years
1880-90, and the totals in the periods were £100,000,000 and
£132,000,000 respectively, we should obtain no grasp of the
difference till we had reduced them to a common denominator
by dividing by the number of years, and found that the averages
in the two periods were £10,000,000 and £12,000,000. This class
of averages is well known in cricket; sometimes the total number
of runs made or wickets taken by each cricketer are stated also,
but these are given rather as so-called statistical curiosities
than as having much bearing on the skill or luck of the players.
The numbers by which the seasons' performances are judged are
the quotients of the number of runs by the number of innings, of
the number of wickets by the number of runs, and so on, all
quantities being reduced to a common denominator. A consideration
of the best methods of comparing cricketers or counties, and
an exposure of the fallacies inherent in the present system,
would afford a useful exercise in the use of averages and the
choice of the most appropriate kind. The average in this
sense is very common in mechanics. The average pressure
per square inch, the average work done by an engine per
minute, the average speed of a train, are quantities which it
is frequently necessary to use. Such an expression as the
average rate of interest is precisely similar.
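
The reduction to a common denominator described above may be set out as a short calculation. What follows is only a sketch in Python; the function name `yearly_average` and the layout of the data are illustrative assumptions, while the totals and the lengths of the periods are those quoted in the text.

```python
# Totals (in pounds) and lengths of the two periods, as quoted in the text.
periods = {
    "1870-79": {"total": 100_000_000, "years": 10},
    "1880-90": {"total": 132_000_000, "years": 11},
}


def yearly_average(total, years):
    """Reduce a period total to the common denominator of a single year."""
    return total / years


averages = {
    name: yearly_average(p["total"], p["years"]) for name, p in periods.items()
}

for name, value in averages.items():
    print(f"{name}: £{value:,.0f} per year")
```

The totals differ by 32 per cent, but the yearly averages only by 20 per cent, because part of the difference in the totals is due merely to the extra year.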

[Marginal note: Averages as rates.]
It will be clear that percentage is a special case of this
use of average. It is useless when comparing the growths
of population or of trade to give only the whole numbers.
An increase of 50,000 in the population of London is not so
significant as one of 10,000 in that of Harrow; they must be
expressed as increases of 1 per cent and 150 per cent, say,
before their meaning can be appreciated, and this is the same
thing as giving the average increase to 100 inhabitants. For
this reason the records of births, deaths, and marriages are
always given as rates, so many per 1,000 inhabitants; and in
these cases a double average is given, for the rates signify so
many per 1,000 inhabitants per annum.

Another extension of the same use is found when quantities
are reduced to rates "per head" of the population. This
use is solely for comparison, and the principle employed is
that of the common denominator. It would be futile to state
that the amount spent on drink was, say, £100,000,000 in 1860
and £110,000,000 in 1890; but the corresponding statements
that the amounts were £3. 10s. per head in 1860 and £2. 15s.
per head in 1890 would make a comparison possible. Or, to
take a better instance: in studying the increments in the values
of England's foreign trade, an entirely wrong view is obtained,
unless we calculate for each year the value per head of the
population, instead of looking only at the totals. A neglect
of this division would make municipal expenditure appear to
be growing much faster than it really is; and in preparing
any comparative summary of figures, it is always necessary
to consider whether such an average should be taken.

[Marginal note: Preliminary definition.]
So far, the averages considered are simply arithmetical, and
satisfy the following definition:

Average × number to which it applies = total quantity dealt with.

e.g., Average weight × number of crew = total weight of crew.

Average value of imports per head of population × number of
population = total value of imports, and so on.

[Marginal note: Its inapplicability.]
The following question, however, will lead us further. The
average weekly agricultural wages in 1892 in Wilts, Dorset,
Devon, Cornwall, and Somerset were 10s., 10s., 13s. 6d., 14s.,
11s. respectively. What was the average in the south-west of
England?

The simplest method is to say, the average was

$$\frac{10\text{s.} + 10\text{s.} + 13\text{s. 6d.} + 14\text{s.} + 11\text{s.}}{5} = \frac{58\text{s. 6d.}}{5} = 11\text{s. 8.4d.},$$

and for many purposes this would be sufficient; but it does
not satisfy the above definition. For when we ask the double
question "11s. 8.4d. multiplied by what number equals what
total?", we can only answer that 11s. 8.4d. multiplied by the
number of items equals the sum of items.
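
The arithmetic of this simple average is easily checked by reducing every wage to pence. The sketch below is written in Python; the helper names `to_pence` and `to_shillings_and_pence` are assumptions made for the illustration, and the five wages are those quoted in the text.

```python
def to_pence(shillings, pence=0):
    """Convert shillings and pence to pence (12 pence to the shilling)."""
    return 12 * shillings + pence


def to_shillings_and_pence(total_pence):
    """Split an amount in pence into whole shillings and remaining pence."""
    shillings, pence = divmod(total_pence, 12)
    return int(shillings), pence


# Average weekly agricultural wages in 1892, as quoted in the text.
county_wages = {
    "Wilts": to_pence(10),
    "Dorset": to_pence(10),
    "Devon": to_pence(13, 6),
    "Cornwall": to_pence(14),
    "Somerset": to_pence(11),
}

total_pence = sum(county_wages.values())
simple_average = total_pence / len(county_wages)
shillings, pence = to_shillings_and_pence(simple_average)

print(f"Sum of items: {total_pence // 12}s. {total_pence % 12}d.")
print(f"Simple average: {shillings}s. {pence:.1f}d.")
```

The divisor here is the number of counties, not the number of labourers, which is why the result cannot satisfy the definition given above.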

We must consider further what we understand by the
expressions "average wage in each county," and "average wage
in the group of five counties."

It may be supposed that the average wage in Wilts, for
instance, was compiled by getting returns from different villages,
say 12s., 11s., 9s., 9s. 6d., 10s. 6d., 9s., 9s., adding them and
dividing by the number of villages. This of course satisfies
our definition no better than the former. What is to be
understood by the average in each village? If our present
definition is to be satisfied, it should be the total of the wages
paid in the village divided by the number of workers. It is
hardly necessary to say that this total is never found in such
an investigation, and the average is given from observation or
by guess-work, not by calculation.

If, however, the village average was correct, and we had 
returns from all the villages in the county, we should find 
the county average as follows : — 

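$$\frac{12\text{s.}\times 200 + 11\text{s.}\times 150 + 9\text{s.}\times 300 + 9\text{s. 6d.}\times 150 + 10\text{s. 6d.}\times 400 + 9\text{s.}\times 200 + 9\text{s.}\times 200}{200 + 150 + 300 + 150 + 400 + 200 + 200} \approx 9\text{s. 11.8d.},$$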

where the numbers in the denominator are the numbers of 
labourers in the respective villages. We should then have the 
same result as if we had had the wages of all the labourers 
in the county put down on a sheet, added up, and divided by 
their number, and the average would satisfy the definition. 

It is clear that we can simplify this arithmetical work, 
for if we divide throughout by 50 we get the same result; 
this is as if we said there were 4, 3, 6 . . . labourers in the
villages instead of 200, 150, . . . Thus we get the same
result if we take numbers proportional to the total numbers 
of the labourers instead of the actual numbers. This plan 
has two advantages: first, that though we do not know the 
numbers of labourers, we know numbers nearly proportional 
to them, viz., those included in the census returns under the 
general headings relating to agriculture ; and secondly, we need 
not choose our numbers with absolute exactness ; thus the 
numbers of labourers above given may be supposed to be round 
numbers substituted for 213, 145, 320 . . . ; and it will presently 
be seen that such differences hardly affect the average. We 
idealize the village, and suppose it to contain round numbers ; 
and then for the numerical work take simple numbers pro- 
portional to these. This is important as simplifying numerical 
work. 
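
That proportional weights give the same result as the actual numbers can be verified directly. The following is a minimal sketch in Python using exact fractions; the function name `weighted_average` is an assumption for the illustration, and the wages and numbers of labourers are those of the seven villages supposed above.

```python
from fractions import Fraction

# Village wages in shillings (9s. 6d. = 19/2 s., 10s. 6d. = 21/2 s.).
wages_shillings = [Fraction(12), Fraction(11), Fraction(9), Fraction(19, 2),
                   Fraction(21, 2), Fraction(9), Fraction(9)]
# Numbers of labourers in the respective villages.
labourers = [200, 150, 300, 150, 400, 200, 200]


def weighted_average(values, weights):
    """Sum of value x weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


county_average = weighted_average(wages_shillings, labourers)

# Divide every weight by 50, giving 4, 3, 6, ... as in the text.
proportional = [n // 50 for n in labourers]
scaled_average = weighted_average(wages_shillings, proportional)

print(proportional)
print(float(county_average), float(scaled_average))
```

Both calculations give 639/64 shillings, that is about 9s. 11.8d., since the common factor 50 cancels between numerator and denominator.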

Averages obtained for the county in this way do not
absolutely satisfy our definition, but are very nearly equal to
those that do. We can then proceed to take the average for
the south-west of England on the same principles.

B. Weighted Averages. This discussion introduces and
gives an example of the very important statistical method
known as "weighting the average." We may illustrate it
further from the same figures by considering what weights to
apply to get this average for South-West England. We may find
the number of agricultural labourers in the counties and work
out the average thus:

$$\frac{10\text{s.}\times 20{,}000 + 10\text{s.}\times 30{,}000 + \dots}{20{,}000 + 30{,}000 + \dots};$$

or we may argue that since we have no means of knowing the
exact numbers of labourers we may as well arrange the
weights, according to the importance of the counties, say
20,000, 30,000, &c., from some other point of view, and
take numbers representing such quantities as the amounts
of wheat produced, the area, or the rate of increase of
population. In this particular case these methods would be
absurd, but in other problems the weights are not so obvious.
Suppose, for example, that we are considering the attraction
of London on the inhabitants of various counties; that we are
told that so many immigrants arrive from Essex, Norfolk, and
Suffolk, and so many from Stafford and Worcester, and we are
asked to compare the attractive power on the agricultural and
manufacturing counties. Should we weight the numbers given
by the total numbers of inhabitants of the contributing counties,
or by their distance from London, or by some quantity derived
from these?

A more practical problem, the classical and most useful
application of weights, is the formation of an index number
for the change of prices by fitting suitable weights
to the changes measured in the prices of various
commodities. This will be considered separately,* but it is best
to deal with the first principles here. It is required to find the
change in the value of gold when measured by the prices of other
commodities. Suppose that we are given that the prices of
certain commodities between two years were in the following
ratios:

                  Wheat.   Silver.   Meat.   Sugar.   Cotton.
First Year    -    100      100      100      100      100
Second Year   -     77       60       90       40       85

* See infra, Chap. IX.

The simplest way to estimate for the general fall in price is
to take the simple average of the numbers in the second year,
viz., 70.4; and say that general prices in the second year were
70.4 per cent of those in the first, and the value of gold had
increased in the ratio 100:70.4 when expressed in commodities.
But it is at once clear that we cannot allow the commodities
given to have equal influences on the result; wheat is of greater
importance than sugar and meat than silver; and again we
have taken arbitrarily three items to represent food and one for
clothing; we need some means of deciding relative importance.
Suppose we decide that wheat, cotton, meat, and sugar are
respectively 7, 4, 3 times and twice as important as silver, we
should get the following table:

Commodity.     Relative Price in     Weight Assigned.     Product.
                 Second Year.

Wheat    -           77                    7                539
Silver   -           60                    1                 60
Meat     -           90                    3                270
Sugar    -           40                    2                 80
Cotton   -           85                    4                340
                    ---                   ---              ----
                    352                    17              1289

Weighted average is $\frac{1289}{17} = 75.8$

Unweighted average is $\frac{352}{5} = 70.4$

This process is equivalent to writing down the price of wheat 
seven times, silver once, meat thrice, &c., and then taking the 
simple average of these numbers. 
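
The two averages, and the equivalence with writing each price down as many times as its weight, may be checked by a short calculation. This is a sketch in Python under no assumptions beyond the figures of the example (the second-year prices 77, 60, 90, 40, 85 and the weights 7, 1, 3, 2, 4); the variable names are illustrative only.

```python
# Second-year prices (first year = 100) and the weights assigned in the text.
prices = {"wheat": 77, "silver": 60, "meat": 90, "sugar": 40, "cotton": 85}
weights = {"wheat": 7, "silver": 1, "meat": 3, "sugar": 2, "cotton": 4}

# Simple (unweighted) average: 352 / 5.
unweighted = sum(prices.values()) / len(prices)

# Weighted average: sum of price x weight over sum of weights, 1289 / 17.
weighted = sum(prices[c] * weights[c] for c in prices) / sum(weights.values())

# Equivalent process: write each price down as many times as its weight,
# then take the simple average of the resulting list of 17 numbers.
repeated = [prices[c] for c in prices for _ in range(weights[c])]
by_repetition = sum(repeated) / len(repeated)

print(f"unweighted = {unweighted:.1f}")
print(f"weighted   = {weighted:.1f}")
print(f"repetition = {by_repetition:.1f}")
```

The last two figures agree exactly, since the list of repeated prices has the same sum, 1289, and the same count, 17, as the weighted calculation.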

[Marginal note: Mechanical illustration.]
The idea is made clearer by the mechanical analogy in which
the word weight originated. Suppose a uniform weightless rigid
rod graduated in 100 equal divisions, and equal weights hung
at the 77th, 60th, 90th, 40th, and 85th divisions from one end;
the rod will then balance at a point corresponding to the
unweighted average, 70.4 intervals from the same end. Now,
suppose the equal weights replaced by weights of 7, 1, 3, 2,
4 lbs. respectively, and the rod will balance at a point
corresponding to the weighted average, 75.8 intervals from the
same end. The further any particular mass is moved, or the
heavier it is, the more the centre of gravity will be shifted;
and this clearly corresponds to the influence we should wish
the various prices to have in the statistical problem. The
formula in use in Statics, $\bar{x} = \frac{\Sigma(wx)}{\Sigma(w)}$,
which corresponds to the arithmetic on the previous page, can
also be used in Statistics.

[Marginal note: The small effect of weights.]
The discussion of the proper weights to be used in this and
other averages has occupied a space in statistical literature out
of all proportion to its significance, for it may be said at once
that no great importance need be attached to the special
choice of weights; one of the most convenient facts of
statistical theory is that, given certain conditions, the same
result is obtained whatever logical system of weights is applied.
We must postpone the mathematical analysis of this proposition,
but may offer immediately some arithmetical illustrations.

[Marginal note: Example from the Wage Census.]
The table on the next page affords an example of this
principle,* and is worth careful study. At the commencement of the
Wage Census, circulars were sent to all the principal firms in
all well-located trades, asking for details as to wages. Of these
some were not returned, and the numbers allotted in the Final
Report to each trade are not the numbers which actually belong
to the trade in the whole country, but the numbers of those in
the firms which made returns. The average wage given is not
therefore the arithmetic average for these trades for the whole
country corresponding to the definition given above for average,
but the average of the average wages as returned in each trade
weighted by the numbers for whom returns were made; so that the
average wage given for the whole group of trades might have
proved to be different, if with the same average in each trade
the returns had been complete. It is very unlikely, however,
that there would have been any great difference. In the table
several systems of weighting are used; the first are the numbers
in these returns, giving an average, 24s. 7d.; the second are the
numbers belonging to each trade according to the census when
they are above a certain minimum, giving an average 25s. 3d.;
the third is a purely arbitrary list of figures taken from a
source which has no connection with wages, and the average is
24s. 5½d.; the last is the unweighted average, that is, all the
weights are equal, and the average is now 24s. 2d. These averages
are close together, while the original items vary from 16s. 6d.
to 30s. 5d. It is to be noticed that the true weights are not
known in this case, but that owing to this principle we are able
to dispense with them entirely.

* From the Statistical Journal, December 1897, with corrections.

Examples of the Smallness of the Change Introduced by
Difference in Systems of Weighting.

From the Wage Census.

                                      Average   Number     Numbers      Arbitrary   Equal
                                      Wages     Included   Employed     System of   Weights.
Trade.                                (Men).    in         in Trade     Weights.
                                                Returns.   when known.
                                                           Unit 1,000.
                                      s.   d.

Cotton Manufacture                    25    3    32,189       142          144
Woollen Manufacture                   23    2    12,248        54          172
Worsted and Stuff Manufacture         23    4    [?]           38          219
Linen Manufacture                     19    9     6,807        22           96
Jute Manufacture                      19    4     2,799         9           23
Hemp, &c., Manufacture                23    6     1,232         3           78
Silk Manufacture                      22    3     2,248        10          189
Carpet Manufacture                    26    7     1,292       ...          [?]
Hosiery Manufacture                   24    5     1,070         8          287
Lace Manufacture                      27    3       593         8           51
Smallwares Manufacture                20    2     2,734       ...          225
Flock and Shoddy Manufacture          21    2       330         2          200
Coal, Iron Ore, and Ironstone
  Mines                               22   11    67,429        57          142
Metalliferous Mines                   16    6     5,046       ...          190
Shale Mines and Paraffin Oil Works    25  ...     3,021       ...          207
Slate Mines and Quarries              22    1     6,933  }                 232
Granite Quarries and Works            21   11     2,315  }     12          206
Stone Quarries                        23   10     3,956  }                  34
China, Clay, &c., Works               18    8       499       ...           39
Police                                27    7    52,682        58          224
Roads, Pavements, and Sewers          20    9    24,276       ...           29
Gasworks                              27    2    27,965       ...           40
Waterworks                            24    9     5,187       ...          151
Pig Iron (Blast Furnaces)             24    6     6,234       ...          128
General Engineering Iron and
  Brass Foundries and Machinery
  Trades                              25    9    41,658       200          [?]
Shipbuilding, Iron and Steel          29    3    10,661        80          228
Tinplate Works                        33 [?] 5   11,514       ...          178
Saw Mills                             24    3     2,088       ...          174
Brass Works and Metal Wares           29    7     1,838       ...          222
Shipbuilding, Wood                    28    4       454       ...           79
Cooperage Works                       30    5       327       ...          165
Coach and Carriage Building           26    6     1,664       ...           28
Boot and Shoe Making                  24    3     2,902       ...          142
Breweries                             24    3     8,366       ...           46
Distilleries                          20    4     1,795       ...          129
Brick and Tile, &c., Making           22   10     3,188       ...           55
Chemical Manure Works                 23  ...     1,054       ...          210
Railway Carriage and Wagon
  Building                            25    2     2,239       ...          233

                                                  s.  d.      s.  d.      s.  d.     s.  d.
Averages                                          24   7      25   3      24   5½    24   2

[?] marks a figure that is illegible or doubtful in the source; ... marks a blank entry.

[Marginal note: Average under many systems.]
The problem dealt with in the next table is to find the
average weekly agricultural wage in England and Wales from the
returns for Michaelmas 1869 and Lady Day 1870, given in columns
1 and 2. There are very many different ways of taking this
average, some of which are as follows: Take the average of summer
and autumn for each county, as in column 3, and then the
unweighted average of these 45 numbers; this is 12s. 7d. Suppose
the summer wage to be paid twice as long as the autumn wage,
as in column 4, and proceed as before; the average is 12s. 5½d.,
the slight difference being due to the inclusion of harvest
payments in the Michaelmas wage, which makes them higher on the
whole than the summer wages. Again, divide the counties into
geographical groups, take the simple average for each group
(the figures marked a in column 3 and b in column 4) and
weight these by the figures marked c in column 5, the numbers
of agricultural labourers in each group; the average of the a
figures with the c weights is 12s. 5d., of the b figures with the c
weights is 12s. 4d. Again, weight the figures for each county
in column 4 with the numbers in column 5, the most obvious
method of all; the average is then 12s. 4d. Again, take the
simple average of the district averages a and b, that is, give each
of the eight districts equal weights; the averages are 12s. 4½d. [?]
and 12s. 3¼d. [?] Or take the simple average of column 3, counting
Yorkshire and Wales each as one county; it is 12s. 8d.

To obtain new groups, take as weights not the number of
agricultural labourers, but the total population of the districts,
the numbers marked d. Exclude the population of London as
exerting a preponderating influence unconnected with agriculture.
A new factor is now introduced, for population is greatest in the
manufacturing districts, where agricultural labour is of
comparatively little importance, but receives high wages; these
high wages have undue weight, and the average of the figures b
with weights d is brought up to 13s. 1¾d. The "median" of the
county averages in column 4 is 12s. 1d. If column 4 is rewritten
correct only to the nearest 1s., and column 5 to the nearest
10,000, the weighted average is 12s. 5d. If column 3 is
weighted with random numbers quite unconnected with the
problem, viz., the successive digits in the third decimal places
of the logarithms of the numbers 2 to 46, the average is 12s.
10¾d. The reader may try any other system of logical or
absurd weights, and he will find that unless there is some bias
in the selection of weights, or great preponderance is given to a
few counties, the average will be little affected.

Agricultural Wages in 1870.
Illustrations of Various Methods of Weighting, and their Results.

                    1.         2.         3.          4.           5.             6.
                 Michael-   Lady Day   Average     Average of   No. of         Whole
                 mas        1870.      of Cols.    Col. 2 × 2   Agricultural   Population
                 1869.                 1 and 2.    and          Labourers      in Groups.
                                                   Col. 1.      in Groups.     Unit
                                                                Unit 1,000.    100,000.
                  s.  d.     s.  d.     s.  d.      s.  d.

Sussex            12   3     12   0     12   1½     12   1          34            ...
Surrey            14   0     13   6     13   9      13   8          16            ...
Kent              14   6     14   0     14   3      14   2          44            ...
Hants             11   0     10   6     10   9      10   8          32            ...
Berks             12   0     10   0     11   0      10   8          22            ...
  Average          ...        ...     a 12   4½   b 12   3       c 148          d 22

Herts             14   7     11  10     13   2½     12   9          20            ...
Northants         12   6     11   6     12   0      11  10          23            ...
Hunts             16   0     11   0     13   6      12   8           9            ...
Bedford           13   0     12   0     12   6      12   4          17            ...
Camb.             11   0     12   0     11   6      11   8          24            ...
  Average          ...        ...     a 12   6    b 12   3       c  93          d 14

Essex             12   6     11   0     11   9      11   6          45            ...
Suffolk           10   6     11   0     10   9      10  10          41            ...
Norfolk           11   6     11   6     11   6      11   6          44            ...
  Average          ...        ...     a 11   4    b 11   3       c 130          d 12

Wilts             11   0     10   3     10   7½     10   6          26            ...
Dorset             9   6     10   3      9  10½     10   0          17            ...
Devon             10   0     10   3     10   1½     10   2          34            ...
Cornwall          11   0     11   0     11   0      11   0          17            ...
Somerset          11   0     10   6     10   9      10   8          31            ...
  Average          ...        ...     a 10   6    b 10   6       c 125          d 19

Stafford          13   0     13   0     13   0      13   0          19            ...
Gloucester        11   9     10   9     11   3      11   1          22            ...
Hereford          10   3     10   0     10   1½     10   1          12            ...
Salop             11   0     11   6     11   3      11   4          21            ...
Worcester         13   6     11   0     12   3      11   6          15            ...
Warwick           13   6     12   0     12   9      12   6          20            ...
  Average          ...        ...     a 11   9    b 11   7       c 109          d [?]

Leicester         14   0     13   0     13   6      13   4          15            ...
Rutland           12   6     12   0     12   3      12   2           3            ...
Lincoln           14   0     13   6     13   9      13   8          49            ...
Notts             13   6     13   0     13   3      13   2          16            ...
Derby             13   6     14   0     13   9      13  10           8            ...
  Average          ...        ...     a 13   3½   b 13   3       c  91          d [?]

Cheshire          13   6     13   6     13   6      13   6          18            ...
Lancs.            15   0     15   0     15   0      15   0          30            ...
Yorks, W.         19   0     15   3     17   1½     16   6          30            ...
Yorks, N.         17   4     13   6     15   5      14   9[?]       16            ...
Durham            16   6     16   0     16   3      16   2           8            ...
Northumberland    19   6     16   6     18   0      17   6          12            ...
Cumberland        15   0     15   0     15   0      15   0          10            ...
Westmoreland      16   3     15   6     15  10½     15   9           3            ...
  Average          ...        ...     a 15   9    b 15   6       c 127          d [?]

Monmouth          12   6     13   9     13   1½     13   4           6            ...
Wales:
  Glamorgan       14   6     14   6     14   6      14   6           5            ...
  Caermarthen     12   4     11   6     11  11      11   9½          4            ...
  Pembroke        11   0     10   0     10   6      10   4           4            ...
  Cardigan         9   0      8   6      8   9       8   8           5            ...
  Brecknock       12   0     12   0     12   0      12   0           4            ...
  Radnor          10   0     10   0     10   0      10   0           2            ...
  Carnarvon       12   0     12   0     12   0      12   0           5            ...
  Average          ...        ...     a 11   7    b 11   7       c  35          d 14

[?] marks a figure or fraction that is illegible in the source.

Since the true system of weights which would reduce the
general average to our definition must be allied to some of those
here adopted, and can hardly show greater divergence from
12s. 4d. than these do, we may feel confident that the true
average is within, say, 3d. of this figure. The original items
varied from 8s. 6d. to 19s.; the averages, even those based
on the most extravagant methods, are contained by the limits
12s. and 13s. 1¾d. Without some such argument as this we
should have no clue to the magnitude of the error introduced by
erroneous weights. This is of the greater importance, because
in many statistical questions the true weights are undefinable or
incalculable; now it is seen that, given certain conditions, there
is no need to calculate or define the weights. Notice, however,
that no system of weights can remove an original bias common
to all the figures; if, for example, winter wages throughout
were 1s. less than here reckoned, the corresponding deficit would
appear unchanged in all the averages found. So we arrive at a
very important precept; in calculating averages give all your
care to making the items free from bias, and leave the weights to
take care of themselves.
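
Both halves of this precept can be illustrated from the eight district averages of the table of agricultural wages. The sketch below is in Python and rests on stated assumptions: the a figures of column 3 are taken as read from the table (two of them with a halfpenny fraction), the c figures are the numbers of labourers in thousands, and the uniform deduction of 1s. is an artificial bias introduced only for the illustration.

```python
# District averages "a" (column 3) in pence, and labourers "c" in thousands,
# for the eight groups of counties as read from the table.
group_avg_pence = [148.5, 150, 136, 126, 141, 159.5, 189, 139]
labourers_thousands = [148, 93, 130, 125, 109, 91, 127, 35]
equal = [1] * len(group_avg_pence)


def weighted_average(values, weights):
    """Sum of value x weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


by_labourers = weighted_average(group_avg_pence, labourers_thousands)
by_equal_weights = weighted_average(group_avg_pence, equal)

# A bias common to all items: every district average reduced by 1s. (12d.).
biased = [v - 12 for v in group_avg_pence]
shift_labourers = by_labourers - weighted_average(biased, labourers_thousands)
shift_equal = by_equal_weights - weighted_average(biased, equal)

print(f"weighted by labourers: {by_labourers:.1f}d.")
print(f"equal weights:         {by_equal_weights:.1f}d.")
print(f"deficit under either system: {shift_labourers:.1f}d., {shift_equal:.1f}d.")
```

The two systems of weights differ by less than a penny (about 12s. 5.3d. against 12s. 4.6d.), while the common deduction of 1s. reappears undiminished in both averages.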

C. The Mode. Passing now from the discussion of the
arithmetic average and its development the weighted average,
let us consider two other means in common use among
statisticians but unfortunately not yet consciously introduced into
common parlance. There are, however, some popular phrases
which, if they have any definite meaning, very nearly resemble
the averages in question.

[Marginal note: The average man.]
When we hear of the average clerk, the average undergraduate,
the average working-man, the phrases admit many interpretations.
In some way these persons are supposed to be types of their kind.
The average clerk may be supposed to mean the one who
receives the average income of all clerks, whose expenditure on
necessaries and on luxuries is the average of all of his class, who
takes the average amount of interest in his work, is of average
ability and average age, perhaps also of average height and
weight. It will be seen that this clerk is ideal, and not to be
found in any random assembly of half-a-dozen; for each
of these will have some peculiarity, some quality in which he
differs from the average; the average man of the newspapers
does not exist in the flesh, but is an imaginary person to whom
certain attributes are attached.

[Marginal note: Quetelet's average man.]
Quetelet's average man is familiar;* he is of average height,
weight, strength, girth and lung capacity, with eyes of normal
range and medium tint; but he is a more satisfactory model
than the newspapers' average, for in regarding him we see the
type from which all other men may be supposed to have deviated;
the creature that would have been produced if all disturbing
causes were removed. That any actual person should answer
exactly to all these standards is of course in the highest degree
improbable.

* See Quetelet's Physique Sociale; and Edgeworth in Statistical
Journal, December 1893.

Quetelet refers neither to the arithmetic average, the median 
nor mode, but to a mean about which all the similar measure- 
ments are grouped in accordance with a definite law, the obedience 
of anthropometrical measurements to which was his chief theme. 

The newspaper average, on the other hand, seems to be the
mode, the position of the greatest density, which may be
explained as follows: Referring back to the table of
American wages, p. 91, or the table on next page,
it will be noticed that in looking down column 2 we find the
numbers increase till we come to 685 (between $1.15 and $1.24),
and then after fluctuations diminish. This number, 685, is the
greatest which occurs in any 10-cent group; and its position is
called the mode, or the position of greatest density, or the position
of the maximum ordinate, or the rate is spoken of as predominant.

[Marginal note: Method of determining the mode.]
In this column 2 we have, however, 14 maxima in the
correct sense of the word, the numbers rise and fall with little
regularity, and there are 14 modes of which that at
$1.15-$1.24 is the most pronounced. But if the groups are
made wider, and the numbers entered as in column 6 in
half-dollar limits, there are only three modes, or if we neglect
the small group of 8 at $5.00 only two. The position of the
largest group of 1,472 is not at once assignable more closely
than as between .75 and 1.25. We can get a little closer by
the following method:

Numbers earning as much as $0.65, and not as much as $1.15      944
   ,,      ,,    ,,  ,,  ,,  0.75   ,,  ,,  ,,  ,,  ,,  1.25    1,472
   ,,      ,,    ,,  ,,  ,,  0.85   ,,  ,,  ,,  ,,  ,,  1.35    1,458
   ,,      ,,    ,,  ,,  ,,  0.95   ,,  ,,  ,,  ,,  ,,  1.45    1,747
   ,,      ,,    ,,  ,,  ,,  1.05   ,,  ,,  ,,  ,,  ,,  1.55    2,012
   ,,      ,,    ,,  ,,  ,,  1.15   ,,  ,,  ,,  ,,  ,,  1.65    1,780
   ,,      ,,    ,,  ,,  ,,  1.25   ,,  ,,  ,,  ,,  ,,  1.75    1,297
   ,,      ,,    ,,  ,,  ,,  1.35   ,,  ,,  ,,  ,,  ,,  1.85    1,527
   ,,      ,,    ,,  ,,  ,,  1.45   ,,  ,,  ,,  ,,  ,,  1.95    1,127
   ,,      ,,    ,,  ,,  ,,  1.55   ,,  ,,  ,,  ,,  ,,  2.05      934

[Table: Determination of the Mode. Numbers of Wage-Earners from
the Senate Report, 1893, U.S.A.]

Now the greatest number is in the group $1.05-$1.55, and
the "mode" may be stated as near the middle point of the
group, viz., $1.30; not at this point, for there are only 99
wage-earners in the group $1.25-$1.34.
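
The selection of the fullest half-dollar group is a simple search for a maximum. The sketch below is in Python; the limits are expressed in cents to avoid fractions, the variable names are illustrative assumptions, and the ten counts are those of the overlapping groups listed above.

```python
# Overlapping half-dollar groups: (lower limit, upper limit, wage-earners),
# limits in cents, counts as listed in the text.
windows = [
    (65, 115, 944), (75, 125, 1472), (85, 135, 1458), (95, 145, 1747),
    (105, 155, 2012), (115, 165, 1780), (125, 175, 1297), (135, 185, 1527),
    (145, 195, 1127), (155, 205, 934),
]

# The fullest group, and its middle point as a first approximation.
low, high, count = max(windows, key=lambda w: w[2])
midpoint = (low + high) // 2

print(f"fullest group: ${low / 100:.2f} to ${high / 100:.2f} ({count} earners)")
print(f"middle point:  ${midpoint / 100:.2f}")
```

As the text observes, the middle point is only a rough indication, for the numbers are not spread evenly within the group.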

Another method of approximating to the mode may be
illustrated as follows: When the numbers are tabulated in
10-cent groups, as on p. 91, the mode is quite indeterminate;
in 20-cent groups the successive numbers beginning at $.25-$.44
are 16, 144, 270, 370, 989, 557, 538, 531, &c., and the number
989 (in the group $1.05-$1.24) is a distinct mode; if we begin
the 20-cent groups at $.35-$.54, the numbers are 74, 242, 282, 505,
784, 924, 274, &c., and 924 (in the group $1.35-$1.54) is a mode;
by this double tabulation it is seen that the 20-cent grouping
does not decide the mode. In 30-cent groups we have 355, 674,
1,242 ($1.15-$1.44), 740, &c., if we begin with $.55-$.84; we
have 439, 1,190 ($.95-$1.24), 1,023, &c., if we begin with $.65-$.94;
and 483, 1,088 ($1.05-$1.34), 996, &c., if we begin with $.75-
$1.04: the modes by each of these groupings lie in a group
which contains $1.15 to $1.24, and this smaller group may be
assumed to contain the mode, which is thus at or near $1.20.
The example here taken is drawn from a group of very irregular
figures, which specially illustrate the difficulties. The method
just adopted may be summarised thus: Tabulate the figures
again and again in gradually widening groups till regularity is
obtained; then examine again the groups which have the selected
width and see if the mode is shifted when the lower limit of the
grouping is moved; if it is shifted the groups are not wide
enough; if it is not, the mode is in the smallest group common
to the larger equal groups which all contain it. A more accurate
diagrammatic method is described on p. 154.
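
The last step of this method, finding the smallest group common to the modal groups of the several tabulations, may be sketched as follows. The code is Python; the function name `common_group` is an assumption for the illustration, and the limits (in cents, both inclusive) are those of the modal groups named in the text.

```python
def common_group(groups):
    """Return the part common to all the (low, high) groups, or None
    if the groups have no part in common."""
    low = max(lo for lo, _ in groups)
    high = min(hi for _, hi in groups)
    return (low, high) if low <= high else None


# Modal groups found with the two 20-cent tabulations.
modal_20 = [(105, 124), (135, 154)]
# Modal groups found with the three 30-cent tabulations.
modal_30 = [(115, 144), (95, 124), (105, 134)]

print(common_group(modal_20))  # the two tabulations disagree
print(common_group(modal_30))  # the group $1.15 to $1.24
```

The 20-cent modal groups have nothing in common, which is the sign that the groups are not wide enough; the 30-cent groups all contain $1.15 to $1.24.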

Even when our numbers are initially regular, it is seldom 

ind«fl]iiteii»ii ^^^y ^^ determine the mode exactly. The diffi- 

of the pofition culty is best seen by an example. Suppose that 

of tbo mod«. ^^ j^^^^ ^j^^ following returns as to heights of 

a large number of men : — 

67 in. - - 455 

67J » - - 475 

67i M - - 490 

67I » - - 500 

68 „ - - 485 
68| „ . - 467 
68i „ - - 445 
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At first sight the mode appears to be at 67! in. exactly ; but it 
must be remembered that even in accurate measurements all 
heights within | in. of 67^ in. will be entered as 67! if the 
measurements are taken to the nearest quarter inch, or will have 
been tabulated in this way if the measurements were more accu- 
rate. Hence 67J in. in reality stands for from 67I to 6yl in. 
If the 500 heights so entered were distributed uniformly through 
this interval, the mode might be given with 6yl in. with fair 
accuracy; but there are signs in the figures that the mode is 
below this. Suppose that the figures in reality come from the 
following measurements : — 

From 67i to 67I in. 238 \ « ,3 j^ 

» 67f ,. 67i „ 245 / ^^^ "' ^^« '''• 

67^ n 67I „ 245 ^ ;q. at67« 

67I ,, 67f „ 250 / ^95 at 67^ „ 

6.'? " 68^ " IT^ } ^93 at 671 „ 
o7f »» 08 „ 243 J 

68 „ 68^ „ 242 

and that these had been tabulated as in the last column, the 
mode would appear as 67I in. ; while the same figures tabulated 
as before gave it as 6^^ in. The probability of some such 
shifting is seen from the original grouping, where the number at 
67J in. is greater than that at 68 in. From this discussion we 
may see that the mode is always a little indefinite, depending on 
the width of the groups in which the items are tabulated, and on 
the exact position of the limits of the groups. As the items we 
deal with become more numerous, we shall find regularity when 
they are tabulated in narrower groups, and the mode can be 
assigned with greater accuracy. A more satisfactory method of 
determining the mode is that given on p. 155. 
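
The shifting of the mode under a change of grouping can be checked by a short computation. The sketch below is illustrative only: it assumes the eighth-inch counts of the example as they can best be read (238, 245, 245, 250, 250, 243, 242 for successive intervals from 67¼ to 68⅛ in.), and the names `regroup` and `mode` are introduced here for convenience, not taken from the text.

```python
# Counts for successive eighth-inch intervals, keyed by the lower limit
# expressed in eighths of an inch (538 eighths = 67 1/4 in.).
# These figures are an assumed reading of the example wherever the
# print is doubtful.
eighths = [(538, 238), (539, 245), (540, 245), (541, 250),
           (542, 250), (543, 243), (544, 242)]


def regroup(cells, start):
    """Merge adjacent eighth-inch cells in pairs, beginning at index `start`.

    Each quarter-inch group is labelled by its central height in inches,
    which is the lower limit of the second cell of the pair.
    """
    groups = {}
    for i in range(start, len(cells) - 1, 2):
        (_, n1), (lo2, n2) = cells[i], cells[i + 1]
        groups[lo2 / 8] = n1 + n2
    return groups


def mode(groups):
    """Return the label of the fullest group."""
    return max(groups, key=groups.get)


nearest_quarter = regroup(eighths, 1)   # group limits at the odd eighths
shifted = regroup(eighths, 0)           # group limits at the quarter inches

print(nearest_quarter, mode(nearest_quarter))
print(shifted, mode(shifted))
```

The same seven counts give a mode of 67¾ in. under the first grouping and 67⅝ in. under the second, which is the indefiniteness described above.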

[The "average man."]

Now is the "average workman" the man who earns $1.73 
per diem, the simple average of the whole group on p. 120, or a 
man making $1.20, the mode ? In ordinary speech 
the latter is meant. The "average clerk" is not 
the one whose measurable qualities are an arithmetic mean of 
all similar qualities, but one whose qualities are found in the 
same degree in the greatest number of his fellows. There are 
more clerks who read the evening paper than who read Homer, 
more who go to music-halls than to oratorios, more whose 
incomes are £100 than £500, more who live four miles from 
the City than one or twenty. Even with this explanation the 
average man is not a real creature, for fortunately no individual 
has no qualities out of the common. The fact that the average 
is a pure abstraction is of importance directly we apply statistics 
to actual affairs; these American workpeople cannot be legislated 
for in the mass as if they all earned $1.20, or as if those who 
were alike in this did not differ in other respects, even doing very 
varying quantities of work for this wage. 

[Importance of the mode.]

No single measurement expresses completely even the economic condition of a 
group of workmen, but if we are taking a single 
measurement, that of the "mode" is often the 
most useful. It is at the mode that we find the greatest number 
of whose greatest good we may be thinking. Whereas the 
arithmetic mean and the "median" may correspond to no reality 
but be merely numerical conceptions, the mode is precisely that 
number for which most instances can be found. It shows the 
commonest result, that most often obtained, and is of very 
general application. For an intending passenger by train or 'bus, 
it is more important to know the most ordinary than to know the 
average number in a compartment. The mode rather than the 
average in chest measurements is the number most suitable for 
the ready-made clothier. For providing a post-office or a store, 
the mode in postal orders or prices of tea needs to be known 
rather than any other average. Even the favourite coin in a 
collection may show the spirit of the congregation better than 
the arithmetic average of their contributions. In these last 
instances it may be noticed that the mode is quite definite. 

[Advantages of the mode.]

A special feature of the mode is that it is entirely uninfluenced 
by extremes. A cheque for £1,000 in a collection disturbs the 
arithmetic average, but not the mode. The incomes 
of a small number of millionaires and an army of 
paupers may have the same arithmetic average as a nation composed 
entirely of people moderately well off ; but the modes will 
be very different in the two cases. In considering the change 
year by year in a group of figures, as for instance, the wages of a 
large group of workmen, we cannot tell, if we take the arithmetic 
average as our criterion, whether an improvement is due to a 
levelling up of the badly paid or a rapid increase for those who 
were already well off, while the mode will show the changing 
position of the main body. Mr Booth's "London" is crowded 
with instances of this maximum density method. Each age 
diagram shows the mode in ages for an occupation ; each wage 
list the mode in wages. His whole description of Class E, the 
typical workman of modern towns, is based on the same prin- 
ciple. His measurement of social status, based on the number 
of rooms occupied or servants employed, can be used more easily 
for stating the mode (four rooms to a family and no servant) 
than any other average. 

[Shortcomings of the mode.]

An objection to this average is that there are many groups 
of figures to which it is not applicable. If we have a very irregular 
group of numbers with no particular type, 
such as the populations of towns in England, 
the mode would be quite indefinite, or if found, would give no 
information of importance. The use of the mode is to indicate 
the type from which other figures may be regarded as diverging. 
Thus, in these wage figures, the type is about $1.20, and other 
examples lie on either side, wages of men who have for some 
reason or other above or below the normal degree of skill or 
opportunity. If there is a type, as in Quetelet's instances, the 
mode will show it. The mode only tells us one fact, however, 
about each type, whereas the methods already given (p. 92) show 
us several. 

D. The Median.—The median, with its dependents, the 
quartiles, deciles, and percentiles, has already been used on 
p. 92. Arrange all the items of the group in ascending order 
of magnitude ; the item half-way up the list is the median ; 
those one-quarter and three-quarters up are the quartiles ; those 
one, two . . . nine-tenths up are the deciles ; those one, two 
. . . ninety-nine hundredths up are the percentiles. 

[Advantages of the median.]

The median is the most useful of the averages ; so useful that it is worth an 
effort to engraft the word and its meaning on the 
public and official minds, where perhaps it may 
bear fruit by the year 2,000.* It is very nearly definite in position, 
thereby differing from the mode ; if we have an odd number of 
items, it is the middle one ; if an even number, it lies between 
the two middle items, which are generally very near together, or 
coincides with them if they are equal. It is not affected by 
exceptional entries at all ; the existence of any number of 
millionaires has no more effect on the median income than that of an 
equal number of any other persons whose incomes are above the 
median. For many purposes it is of course necessary to allow 
these extreme instances more weight than those which are nearer 
the average ; but the arithmetic average often gives them undue 
weight for this democratic age, since a single millionaire can 
counterbalance thousands of ordinary working men. A further 
advantage is that it is extremely simple to find, not needing 
much arithmetical work, for we need not do more than count 
those well above and well below the average, and look more 
carefully at those near it. 

* While this was in the press, Mr Wilson Fox's Report on the Agricultural 
Labourer was published ; on p. 25, the median is explicitly used. 
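
The indifference of the median to exceptional entries is easily verified. The following is a minimal sketch on hypothetical figures (the incomes are invented for the purpose and come from no return): the two highest earners are replaced by "millionaires", and the median and arithmetic average are compared before and after.

```python
from statistics import mean, median

# Hypothetical weekly incomes in shillings; not taken from any return.
incomes = [18, 20, 22, 24, 25, 26, 28, 30, 35]

# Replace the two highest earners by very large incomes.
with_millionaires = incomes[:-2] + [20_000, 40_000]

# The median depends only on how many items lie above and below it.
print(median(incomes), median(with_millionaires))

# The arithmetic average is carried far away by the two extreme entries.
print(mean(incomes), mean(with_millionaires))
```

The median stays at the middle item in both cases, while the arithmetic average is raised several hundredfold.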

[No need for complete information.]

There is a yet more important advantage in the use of the 
median ; it can often be found exactly, when our information as 
to the items in question is neither accurate nor 
complete. This will be clear from one or two 
examples. It may be that in the "wage census" 
100,000 persons, whose wages were far below the average, 
did not come into the returns at all, and it is very difficult 
to estimate their effect on the arithmetic average, for want 
of information as to their earnings ; but to find the median 
exactly, we need only know their number, not their earnings ; 
and if we can only assign a maximum for their number, we still 
can place the median within narrow limits. The addition of 
100,000 men with wages below 15s. to the general summary for 
the 356,000 men, would still leave the median in the group 
20s. to 25s. where it already is ; the change would be very 
marked, however, in the lower deciles and quartiles, and the 
arithmetic average would be lowered by at least 2s. 1d. The 
same argument applies to incomes ; information is often very 
deficient, but it is in many cases possible to assert that a number 
of men, whose exact income is unknown, receive above a certain 
assigned sum, or even between two assigned limits, which is all 
we need to know about them to determine the median, if it lies 
below the lower limit. 
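
The argument may be illustrated by a small computation. The sketch below assumes a hypothetical grouped return: the five group totals are invented, chosen only so that they sum to 356,000 and place the median in the group 20s. to 25s.; they are not the figures of the wage census. The function name `median_group` is likewise an assumption.

```python
# Hypothetical grouped return: (wage group, number of men).
# The group totals are invented for illustration and sum to 356,000.
groups = [("under 15s.", 20_000), ("15s. to 20s.", 80_000),
          ("20s. to 25s.", 130_000), ("25s. to 30s.", 80_000),
          ("30s. and over", 46_000)]


def median_group(table):
    """Return the label of the group containing the middle item."""
    half = sum(n for _, n in table) / 2
    running = 0
    for label, n in table:
        running += n
        if running >= half:
            return label
    raise ValueError("empty table")


# Add 100,000 omitted men of whom we know only that they earn under 15s.
augmented = [("under 15s.", 120_000)] + groups[1:]

print(median_group(groups))
print(median_group(augmented))
```

Only the number of the omitted men enters the calculation; their actual earnings are never needed.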

Again, in tracing the history of wages throughout the century 
it is often very difficult to find the correct average, but at the 
same time it is frequently possible to say that a very large class 
of men earned below, say, 15s. a week, and another very large 
class above 30s. whose wages we do not exactly know, and a 
more definite number between 15s. and 20s., and 25s. and 30s. ; 
and in order to find the median all we need to do is to investigate 
more exactly the wages between 20s. and 25s., and even if 
we have not complete information here, we can still say that the 
median certainly lies between certain narrow limits. 

[Incommensurable quantities.]

There is yet another advantage, perhaps more important, that the median 
is applicable to quantities which are not capable 
of measurement at all. This development is especially 
due to Mr F. Galton.* Suppose it to be required, for example, 
to find among a large class of boys the average in intelligence. 
It is clear that it is not easy to find the arithmetic average of a 
quantity which cannot be properly measured even by the most 
elaborate system of marks, but on the other hand it would not 
be at all difficult with a class of, say, twenty boys, to place them 
in order of intelligence without committing oneself to such a 
statement as that A.'s cleverness was 25 per cent. more than 
B.'s ; and the tenth or eleventh boy in this arrangement 
would show the style of boys in the class, at least as well 
as any other average. The disadvantage of this method, the 
reason why it is not universally applicable, is that the median 
of a series of observations may be totally removed 
from its type, and in fact may not be situated near 
any of the different objects which are observed. Thus, if we 
had two large groups of wages of a thousand men between 15s. 
and 25s., and another thousand between 35s. and 45s., the median 
would give us any position between 25s. and 35s., where as a 
matter of fact not a single wage-earner would be found. The 
median is then chiefly useful when we are dealing with a series 
of objects of which the main part lie fairly close together ; a few 
extremes do not affect it.† 

The following table shows the description of 76 items by the 
help of the various averages now described :— 

* See, for instance, Natural Inheritance, p. 47. 

† On the relative advantages of this, and a more mathematical method, 
see Yule and Galton in the Statistical Journal for 1896, especially pp. 
392-398. 

        13.10   5.4     7.2 
 28     13.9    4.9     5.?         66     14.0    4.9     5.0? 
 29     13.4    5.1?    5.2         67     13.3    4.7     5.0 
 30     14.4    5.1     6.8½        68     13.8    4.11    6.11 
 31     14.10   4.9?    4.7?        69     13.7    4.11?   6.4? 
 32     13.2    4.9½    5.13?       70     13.11   4.8     4.4½ 
 33     14.1    4.8?    5.8½        71     13.11   4.8     4.4½ 
 34     13.10   5.2½    6.8?        72     13.2    4.7?    4.10 
 35     14.0    4.11½   5.7         73     14.0    4.11    6.5 
 36     14.4    4.11    ?.5?        74     13.3    4.3½    4.1? 
 37     14.8    4.11    6.0?        75     13.3    5.0     7.2? 
 38     13.7    5.0?    6.2         76     13.7    4.8?    5.6 

between ages 13 and 13½ years, 5 st. 9½ lbs. ; 13½ and 14 years, 
5 st. 13½ lbs. ; 14 and 14½ years, 6 st. 3½ lbs. ; 14½ and 15 
years, 6 st. 8? lbs. 

Heights may be tabulated in the same way. 



[Graphic method.]

A graphic method of finding the median closely is given by 
Mr Galton in the Report of the Anthropometric 
Committee of the British Association, 1881, p. 247 ; 
and is illustrated by the diagram facing the next page. 

On a horizontal line mark off equal intervals representing 
units of measurement, say inches. On a vertical scale 
mark off equal intervals representing the number of instances, 
e.g., persons whose heights are measured. Beginning at the 
lowest, say 51¼ inches, on an imaginary vertical line mark as 
many dots at equal intervals on the vertical scale as there are 
persons at that height, so that each dot represents one person. 
From the highest dot thus marked, suppose a horizontal line 
drawn till it is over the next height division, 51½ inches, and 
with this new base proceed as before, marking each instance 
at 51½ inches by a dot vertically above the 51½-inch mark. 
Next draw a connected line through the middle points of the 
consecutive vertical rows of dots ; if there is an odd number 
of dots, the middle one is taken as the middle point ; if an even 
number, the middle point is half-way between the middle ones. 

On the vertical scale mark the positions of the median, 
quartiles, &c., obtained by dividing the distance representing 
the total number of instances into appropriate parts, and 
through these points draw horizontal lines to intersect the 
connected line already drawn. The points of intersection 
lie vertically above the heights required, as marked on the 
horizontal scale. 

Now it may be assumed that the heights of all persons 
returned at, say, 58½ inches, are in reality evenly distributed 
between the limits 58⅜ and 58⅝ inches, heights lying within 
which would be so returned ; and it can be verified that the 
construction just given shows the place of the median, deciles, 
&c., almost exactly on this hypothesis. 
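
The hypothesis of even distribution within each group lends itself to direct calculation as well as to drawing. The sketch below does not reproduce the graphic construction itself; it assumes only the hypothesis just stated and finds the required position by linear interpolation within the group. The function name `grouped_quantile` and the three heights used are hypothetical.

```python
def grouped_quantile(table, width, p):
    """Quantile of measurements recorded to the nearest `width`.

    `table` maps each recorded value to its number of instances; the
    instances at a value v are taken to lie evenly between v - width/2
    and v + width/2, and the quantile is found by linear interpolation.
    """
    total = sum(table.values())
    target = p * total
    running = 0
    for value in sorted(table):
        n = table[value]
        if n > 0 and running + n >= target:
            lower = value - width / 2
            return lower + width * (target - running) / n
        running += n
    raise ValueError("p must lie between 0 and 1")


# Hypothetical heights in inches, recorded to the nearest quarter inch.
heights = {58.0: 2, 58.25: 4, 58.5: 2}

print(grouped_quantile(heights, 0.25, 0.5))    # median
print(grouped_quantile(heights, 0.25, 0.25))   # lower quartile
```

With these eight instances the median falls at the centre of the middle group, and the lower quartile at the boundary between the first and second groups.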

E. Geometric Mean.—It is not necessary to give a long 
discussion of the geometric or logarithmic mean, for its application 
is limited to a small class of figures which will be best 
dealt with at a later stage.* It was used by Jevons in 
his essay on the Fall in the Value of Gold, but he did 
not justify or explain its use. If we have $n$ quantities 
$a_1, a_2, \ldots a_n$, their geometric mean is $\sqrt[n]{a_1 a_2 \cdots a_n}$. Its 
chief advantage is that the influence of large numbers is 
diminished and of small numbers increased, when the geometric 
mean is employed instead of the simple average. 
Suppose all the following groups of numbers represent price 
levels of various commodities as percentages of their height 
at a previous date :— 



    Numbers.                                         Arithmetic   Geometric 
                                                     Mean.        Mean. 

    80, 160                                          120          113 
    80, 80, 100, 324                                 146          120 
    20, 20, 80, 80, 120, 120                          73.3         57 
    20, 20, 80, 80, 100, 100, 120, 160, 324, 972     198          104 

* See p. 223, infra. 







[Diagram: GRAPHIC METHOD OF FINDING MEDIAN, QUARTILES AND DECILES (after Galton : Anthropometric Committee : Brit. Assn.). For the heights of the 76 boys, between ages of 13 and 15, listed on p. 127. Median 59? inches. Probable error [illegible]. Deciles 55.5, 56.6, 57, 57.9, 63.6, 61, 60.6, 59.7. Arithmetic average, 59.095. Greatest density 57 or 59 ; in smoothed curve would be about 58. Geometric average 58.98.] 

A consideration of the last list leads to the conclusion that the 
general rise of price cannot be 98 per cent., while 4 per cent. may 
reasonably represent it. A tentative rule may be suggested : when 
the geometric mean differs much from the arithmetic it should be 
preferred. It should be calculated with the help of logarithms. 
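
The calculation by logarithms may be sketched as follows for the four groups of price levels tabulated above. The function names are introduced here for illustration. Natural logarithms are used in place of the common logarithms of a printed table, which makes no difference to the result.

```python
import math


def arithmetic_mean(values):
    """Simple average: the sum divided by the number of items."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Geometric mean found by averaging the logarithms of the items."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# The four groups of price levels given in the table above.
groups = [
    [80, 160],
    [80, 80, 100, 324],
    [20, 20, 80, 80, 120, 120],
    [20, 20, 80, 80, 100, 100, 120, 160, 324, 972],
]

for g in groups:
    print(g, round(arithmetic_mean(g), 1), round(geometric_mean(g), 1))
```

The results agree with the table to the nearest unit, except that the geometric mean of the third group comes out at about 57.7, which the table gives as 57.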

F. Statistical Coefficients. — Before leaving the sub- 
ject of averages, we must pay some attention to "statistical 
coefficients." A statistical coefficient is a number, whole or 
fractional, by which a total (e.g., population) must be multiplied 
to give an allied number (e.g., number of births). Thus, 
if the birth-rate is 40 per 1,000, the coefficient is .04. These 
coefficients play an important part in ordinary statistics and 
a very interesting rôle in the application of the law of error 
to demography. The population may increase or diminish, 
but the coefficients relating to certain numbers remain almost 
unchanged,* and by their use the statistics of different coun- 
tries may be compared, and numbers for future years can 
be forecasted in some cases with marvellous accuracy; the 
numbers of births, marriages, deaths in 1901 can be written 
down before their occurrence as exactly as they are needed, 
subject only to the chance of some great catastrophe. Coeffi- 
cients can be formed for births (in various districts), for deaths 
(according to age, profession, or disease), for marriages (at 
various ages), for suicides, crimes, accidents, consumption of 
various commodities ; if the preliminary data could be obtained, 
for the number of persons crossing Westminster Bridge in the 
year, the number of visitors to the Monument, the number of 
umbrellas left in the train, and so on ; the list could be 
prolonged indefinitely. The more important coefficients are 
calculated for most civilised countries, and the rates on which 
they are based published in statistical abstracts. A knowledge 
of them is necessary for statistical investigations. 

[Calculation of coefficients.]

A useful caution is given by Dr Bertillon.† In order that 
a coefficient may obey the laws of coefficients closely, the 
number to which it is to be applied should not 
be that of the total population, but the number 
of persons or things capable of affording an instance of the 
resulting total. Suppose $m$ to be this number of persons, $c$ the 
coefficient, $n$ the resulting total, then $n = cm$ and $c = \frac{n}{m}$. Thus, 
for the marriage rate, $m$ should not be the total population, 
but the number of marriageable people. The importance of 
this rule is, however, not great as far as simple calculations 
are concerned ; for the less accurate coefficient can be easily 
seen to be nearly constant. Suppose $M$ to be the total population, 
$m$ the number of marriageable people, $n$ the number 
of marriages. Let $c_1$ be the coefficient for calculating the 
number of marriageable people, then $c_1 = \frac{m}{M}$ ; $c_2$ the coefficient 
for marriages on Bertillon's principle, then $c_2 = \frac{n}{m}$. Let $c_3 = \frac{n}{M}$, the 
more usual coefficient for marriages. $c_3 = \frac{n}{M} = \frac{m}{M} \times \frac{n}{m} = c_1 \times c_2$. Now 
if $c_1$ and $c_2$ are invariable, so also is $c_3$. If, however, one of 
the factors, say $c_1$, is more variable than the other, then $c_3$ 
varies as much as $c_1$, and the greater constancy of $c_2$ is not 
discovered, if only $c_3$ is calculated. 

* See infra, Part II.    † Cours élémentaire, p. 94 seq. 
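
The relation between the three coefficients may be seen in a small numerical case. The sketch below assumes hypothetical figures for two years (a total population M, a number of marriageable people m, and a number of marriages n, none of them drawn from any real return), in which the proportion of marriageable people changes while the rate on Bertillon's principle does not.

```python
from math import isclose

# Hypothetical figures for two years; none are taken from a real return.
years = {
    "first year":  {"M": 1_000_000, "m": 400_000, "n": 16_000},
    "second year": {"M": 1_000_000, "m": 450_000, "n": 18_000},
}

for label, y in years.items():
    c1 = y["m"] / y["M"]   # marriageable people per head of population
    c2 = y["n"] / y["m"]   # marriages per marriageable person (Bertillon)
    c3 = y["n"] / y["M"]   # marriages per head of population (usual rate)
    assert isclose(c3, c1 * c2)   # c3 is always the product of c1 and c2
    print(label, c1, c2, c3)
```

Here the usual coefficient moves with the first factor, and the constancy of the second would pass unnoticed if the usual coefficient alone were calculated.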


[The function of averages.]

G. General.—The function of averages will now be clear ; 
it is to express a complex group by a few simple numbers. The 
mind cannot grasp the magnitudes of millions of 
items at once ; they must be grouped, simplified, 
averaged. The averages chosen must be those which will give 
the striking features and the essential characteristics of the group. 
Different methods will apply to groups of various classes ; each 
must be taken on its own merits. A good and suitable average 
has the following characteristics :—If there is a type it shows it ; 
it gives due influence to extreme cases ; it is not easily affected by 
errors or much displaced by slight alterations in systems of calculation ; 
and it is easily calculated. 

The relative positions of the different kinds of averages dealt 
with give some information as to the general nature of the group 
to which they refer. The arithmetic average, median and mode, 
are close together, if the group is symmetrical. The arithmetic 
average is probably above the median, if we have a small group 
at a high degree. The arithmetic average is generally below the 
median, if there is an absence of high numbers, and a concentration 
a little above the average. The mode will be badly 
defined, if our group is not homogeneous. The mode will probably 
be below the arithmetic average, if there is a small group 
at a high degree. The mode is well marked, if the distribution is 
uniform. These rules are only tentative and easily nullified by 
exceptional circumstances. 



CHAPTER VI. 

SOME EXAMPLES OF THE USE OF 
AVERAGES IN TABULATION. 






[Application of averages to train service.]

If our analysis of the nature and use of averages is complete, 
and if averages are of widely extended use, we 
should now be able to express almost any group 
of figures by a few well-chosen numbers of definite significance. 

To apply a somewhat severe test at first, let us choose 
a familiar example from ordinary life, and consider how a 
suburban business man might test the merits of 
two railway systems, by one of which he intended 
to take a season ticket. 

The following table gives the train service between Leatherhead 
and London in 1898 :— 

Train Service—Leatherhead to London. 

Number of Minutes to Journey. 

Waterloo— 
    Down—60, 50, 52, 48, 47, 61, 50, 44, 48, 53, 45, 42, 45, 49, 43, 48, 42, 43. 
    Sundays—?0, 50, 47, 49, 50. 
    Up—51, 46, 51, 48, 43, 44, 48, 48, 64, 45, 48, 47, 45, 47, 4?, 47. 
    Sundays—??, 4?, 5?, 5?, 5?. 

London Bridge— 
    Down—6?, 65, 65, 61, 74, 51, 56, 66, 65, 53, 59, 41, 49, 44, 58, 57, 56, 67, 80. 
    Sundays—67, 52, 66, 68, 88, 65, 65, 68, 6?. 
    Up—??, 57, 53, 58, 54, 41, 58, 52, 42, 40, 55, 67, 79, 98, 69, 66, 68, 64, 71. 

Victoria— 
    Down—77, 65, 55, 76, 77, 88, 48, 53, 46, 69, 89, 54, 82, 71, 90. 
    Sundays—??, 45, 81, 84, 78, 61, 85, 83, 85. 
    Up—87, 65, 69, 69, 47, 48, 51, 83, 101, 58, 62, 61, 76, 103. 
    Sundays—?1, 76, 80, 85, 85, 82, 94. 

The following table gives us the necessary information :— 

                                      London 
                                      Bridge.    Victoria.   Waterloo. 
                                      Min.       Min.        Min. 
Average of four quickest trains  -    41         46½         42½ 
Lower decile  -  -  -  -              47½        48          42 
Median                                65         77          48 
Mode                                  65         ...         48 
Number of trains on week days  -      38         29          34 
General average                       63         73          48 

It is to be noticed that the statistical method is generally 
limited to one aspect of a problem ; the question of punctuality 
might, indeed, be easily treated statistically, but the questions 
of comfort and relative picturesqueness of route will elude our 
analysis. 

The next example shows a method of throwing into relief 
the characteristics of a typical group of sociological data. 

[Tabulation of wages returns.]

The adjoining table gives the wages recognised by the 
Amalgamated Society of Engineers in many of 
their branches in 1862 and 1891. 

Amalgamated Society of Engineers.—Wages in 1862 and 1891, 
Weekly, exclusive of Overtime. 

Branch.                 1862.    1891.      Branch.               1862.    1891. 
                        s. d.    s. d.                            s. d.    s. d. 

Accrington              27       31         Faversham             34       33 
Ashford                 33 6     30         Folkestone            34       32 
Ashton-under-Lyne       29 3     34         Frome                 24       27, 30 
Bacup                   26 1     28 
Barrow-in-Furness       31       34 9       Gainsborough          27 6     28 
Barry                   29       31         Glossop               27 2     32 
Bath                    27       29         Gloucester            28       32 
Bilston                 28       30         Grantham              28 6     30 4 
Bingley                 24       29         Grimsby               28       32 
Birkenhead              29       35 6       Halifax               23 1     31 
Birmingham              32       36         Hanley                28 3     32 
Blackburn               27 6     32         Hartlepool            26       34 10 
Bolton                  27 6     28, 32     Heywood               27       30, 34 
Bridgwater              24 6     24         Holyhead              32       28 
Brighton                24 8?    29         Huddersfield          26       26 
Bristol                 31       32         Hull                  27 6     34 
Burnley                 27       30         Hyde                  30, 28   30, 28 
Burton-on-Trent         25       30 
Bury                    28 3     30, 32     Ipswich               28 6     28 
                                            Keighley              23       27 
Cardiff                 31       34         Kidderminster         28       30 
Carlisle                24 6     30         Lancaster             25       32 
Chepstow                30       34         Leeds                 25       30 
Chester                 30       32         Leicester             26       31 6 
Chowbent                26       32         Leigh                 27 9     31 6 
Colne                   25       31         Lincoln               26 7     28 6 
Congleton               24       28         Liverpool             29       34 
Coventry                28       34         Llanelly              22       26 
Crewe                   29 4     30         Macclesfield          24       29 6 
Darlington              25       31 6       Manchester            29 9     35 
Dartford                34       38         Mexborough            27       32 
Darwen                  27       32         Middlesbrough         25       34 
Derby                   26       29         Middleton             29 5     33 
Doncaster               28 6     31 6       Milton and Elsecar    28       34 
Dover                   35 6     36         Neath                 32       30 
Enfield Lock            36       40 6       Newark                25       29 
Exeter                  23       28, 32 9   Newcastle             25       35, 37 
New Holland             30 8     34         Stafford              34       30 
Newport                 30       32         Stalybridge           28 3     32 
New Town (Stockport)    29       32         Stockport             28       32, 34 
Newton Abbott           33       33 
Northampton             26       32         Stockton-on-Tees      24       36 
Northfleet              36       36         Stoke-on-Trent        29       32 
North and So. Shields   26       35         Stroud and Thrupp     26       30 
Norwich                 32       29         Swindon               31 6     31 6 
Nottingham              27 5     34         Todmorden             26       28 
Oldbury                 28       34         Wakefield             ?5       30 
Oldham                  29       33         Warrington            25       34 
Peterborough            28 6     33         Watford               35       36 
Plymouth                32       33         Wednesbury            26       3? 
Pontypridd              24       30         Whitehaven            25       28, 36 
Portsmouth              35       34 
Preston                 27       32         Wigan                 28       34 
Radcliffe Bridge        27       30, 32     Wolverhampton         28       33 
                                            Wolverton             29 2?    29 
Reading                 28       32, 34     Worcester             31 4     30 
                                            Bermondsey            35 
Ripley                  26       26 6       Blackwall             34 
Rotherham               27 6     32         Bow                   36 
Rugby                   32       28, 32     Greenwich             34 
                                            King's Cross          36 
Rugeley                 24 11    30         Lambeth               35 8 
St Helens               28       34, 36     London, E.            35       38 
                                              „     N.            35 10 
Sheffield               28       36           „     S.            35 
Shipley                 25 9     28, 30       „     W.            35 6? 
                                            Marylebone            33 
Shrewsbury              ?0 6?    32         Stratford             35, 33 6? 
Smethwick               28       35 
Southampton             32       34 6       Tower Hamlets         36 6 
Sowerby Bridge          24 6     30         Woolwich              36 ? 


The following figures show the same in brief :— 

                          1.        2.        3. 
                        1862.*    1891.*    1891.† 
                        s. d.     s. d.     s. d. 
Maximum  -  -  -  -     36 6      40 6      ... 
Upper decile  -  -      35        38        38 
Upper quartile          31 4      34        36 
Median                  28        32        34 3 
Arithmetic average  -   28 10     32 4      33 4 
Modes                   28        30, 32    ... 
Lower quartile          26        30        31 6 
Lower decile  -  -      24 6      28 6      30 
Minimum  -  -  -  -     22        24        ... 

* Each branch counting as 1. 

† The numbers of members in each branch counted as receiving 
the wage recognised there. 

If the rates at each branch were not those actually paid to 
all members, but their average, while the actual wages were 
confined within small limits of that average, the figures in the 
last column would be little affected. 

On comparing columns 1 and 2 it will be seen that not 
only have all the averages increased, but that since the lower 
decile and quartile have increased more rapidly than the upper, 
the lower half has also gained on the upper. Again the wages 
are grouped more closely in column 2 than in column 1. 

[Measure of dispersion.]

It is important to choose a simple measure of the dispersion 
of a group that can be easily appreciated and calculated, 
that varies with sufficient sensitiveness with change 
in dispersion, and can be applied generally in comparison 
of group with group. The following satisfies all these 
conditions : express half the distance between the quartiles as a 
fraction of the arithmetic average ; this fraction measures the 
dispersion. For the above figures this quantity is— 

$\frac{\text{2s. 8d.}}{\text{28s. 10d.}} = .092$ in 1862, and $\frac{\text{2s.}}{\text{32s. 4d.}} = .062$ in 1891. 

The dispersion, therefore, diminished in that period. A more 
satisfactory and complete measurement, of which this is an 
adaptation, is discussed in Part II. 
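
The two fractions may be verified by reducing each sum to pence. The sketch below is a check only: the helper names are introduced here, and the upper quartile for 1862 (31s. 4d.) is an assumed reading of an entry that is doubtful in the summary table, chosen because it reproduces the printed result.

```python
def pence(shillings, pennies=0):
    """Convert shillings and pence to pence (12 pence to the shilling)."""
    return 12 * shillings + pennies


def dispersion(lower_quartile, upper_quartile, average):
    """Half the distance between the quartiles, as a fraction of the average."""
    return (upper_quartile - lower_quartile) / 2 / average


# Quartiles and arithmetic averages from the summary table; the upper
# quartile for 1862 (31s. 4d.) is an assumed reading of a doubtful entry.
d_1862 = dispersion(pence(26), pence(31, 4), pence(28, 10))
d_1891 = dispersion(pence(30), pence(34), pence(32, 4))

print(round(d_1862, 3), round(d_1891, 3))
```

Because the measure is a ratio of two sums of money, it is independent of the unit and allows groups with different averages to be compared directly.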

[Tabulation of descriptive answers.]

Group C of Tabulation.—It was necessary to postpone 
the tabulation of non-numerical or descriptive answers till we 
had finished our discussion of averages. We have 
now seen that the median and allied quantities can 
be applied in many unexpected ways ; and the 
following detailed example shows how they can be used to give 
a short description of a large group of adjectival answers. 

In 1891 the Amalgamated Society of Engineers obtained 
from all their branches answers to the question : To what extent 
is overtime worked ? The branch secretaries sent answers which 
may be tabulated as follows :— 

Answers. 

None - - . - 

Not worked - 
Very little 

To very limited extent 
Very occasionally 
A little on repairs 
Little - 



Number of 


Number of 


Branches. 


Members. 


4 


140 


I 


78 


23 


4.836 


I 


63 


I 


350 


I 


500 


2 


73 
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Answers. 

2 hours when necessary 

Seldom 

Small extent - - - 

Seldom except on repairs 

Only on repairs 

Not much 

On repairs 

Not to any extent 

Not to a great extent - 

Not general - 

Not systematically 

In cases of breakdown or emergency 

2 hours regularly 

Chiefly on repairs 

Occasionally - 

When necessary 

Casually (sic) - 

A good deal on repairs 

Maximum 18 hours in 4 weeks 

Moderately 

Systematically in good trade - 

Average about 5 hours a week 

Considerably in marine shops 

Systematically in dockyard 

General 

Systematically 

Great amount - 

To a great extent 

Excessively 

9 hours a week 

10 „ - - 
12 „ (maximum) - 
14 „ (when busy) - 
10 to 18 hours a week - 

Total - 
Unclassed : — 

No answers - - - 

As little as possible - 

Not so much lately - 

In machine shops for six months 

In steel works - - - - 



Number of 


Number of 


Branches. 


Members. 


I 


80 


I 


59 


I 


16 


I 


66 


2 


216 


6 


1,125 


I 


500 


3 


644 


2 


162 


I 


7 


2 


43 


7 


606 


I 


136 


I 


20 


2 


90 


I 


348 


2 


142 


I 


23 


I 


1,000 


3 


262 


I 


200 


I 


96 


I 


400 


I 


650 


2 


146 


I 


693 


I 


263 


I 


72 


I 


550 


I 


39 


I 


106 


I 


700 


I 


106 


I 


5,000 


88 


20,666 


36 


5»"4 


I 


250 


I 


160 


T 


60 


I 


348 
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An inspection of the table here given will show sufficiently 
the method of tabulation. The position of most of the answers 
Expianauon of 1" an imaginary scale is fairly definite, except that 
*»wa it is not always obvious where the numerical 
answers should be placed ; this must be decided either by internal 
evidence or practical knowledge of the trade. The same adjec- 
tives did not of course convey exactly the same numerical 
meaning to all the branch secretaries who used them, but it will 
be admitted that this tabulation gives a fairly clear view of the 
case, and that the method of medians and quartiles may be 
appropriately applied. Taking the member of a branch as the 
unit and neglecting the unclassed answers, the median is 
"Maximum 18 hours in 4 weeks" or "moderately," the lower 
quartile "Very little," and the upper quartile "14 hours when 
busy." Taking the branch as unit, the median is " Not much," 
the quartiles are "Very little" and "When necessary" or 
" Occasionally." 

This method, which, with varying degrees of precision, is 
widely applicable, seems to afford the only way of comparing 
two such groups of answers. The precision attainable is to be
measured by the distance through which the median can be 
shifted by making reasonable variations in the scheme of 
tabulation. 
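
The location of the median on such a verbal scale can be sketched in a few lines of Python. The scale and the counts below are hypothetical and serve only to show the procedure; the function name and the layout of the rows are likewise assumptions made for illustration.

```python
def median_category(rows, unit):
    """Return the label holding the middle unit of an ordered scale.

    rows : list of (label, branches, members), already ranked from
           least to most overtime (the ranking is the analyst's choice).
    unit : 1 to count branches, 2 to count members.
    """
    total = sum(row[unit] for row in rows)
    running = 0
    for row in rows:
        running += row[unit]
        # The median unit lies in the first class whose cumulative
        # count reaches half the total.
        if 2 * running >= total:
            return row[0]
    raise ValueError("empty scale")

# Hypothetical ranked answers: (label, branches, members).
scale = [
    ("Very little", 10, 400),
    ("Not much", 8, 600),
    ("Occasionally", 3, 900),
    ("Systematically", 2, 2500),
]

by_branch = median_category(scale, 1)   # branch as unit
by_member = median_category(scale, 2)   # member as unit
print(by_branch, "|", by_member)
```

The two units may give very different medians when a few large branches stand at one end of the scale, which is why both are quoted in the text.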

Summarisation. — Now that we have the method of averages 
at our disposal we may use it for tabulating and summarising a 
group of figures. 

Consider, for example, the answers to the questions issued 
by the Commissioners on Trade Depression in 1886. 

Four of the questions were : — 

1. Number of men in Society. 

2. Number out of work in 1885. 

3. Weekly wage in 1885. 

4. Change in wages between 1865 and 1885. 

The following table shows the answers given by the branch 
secretaries of the Amalgamated Society of Engineers : — 
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I. 

District.                No. in District,    No. Out of Work,
                         1885.               1885.

Belfast                   1,100                130
Coventry                  2,500                230
Dukinfield                170 +                20 +
Dundee                    1,400                45 %
Glasgow                  28,000              4,000
Glasgow (St Rollox)       1,600                250
Hartlepool                1,200                400
Glossop                     135                 10
Liverpool                   280                 38
Monifieth                   114                 18
Nottingham                4,000                600
Oldham                    1,600                 96
Oxford                       45                ...
Paisley                     800                ...
Preston                     630                 40
Preston                     900                120
Shipley                     201                 15
Sowerby Bridge            1,120                 43
Sunderland                3,200                400
Swindon                   6,050                  2
Ulverston                    45                ...
Wednesbury                  400                 30
Workington                  170                 70

Current Wages, 1885.

28/ to 36/
31/6
31/
25/ skilled.
15/ unskilled.
26/

31/6
32/

21/

34/ minimum.
33/ average.
33/
28/6
28/
28/
28/6
94/ non-unionists.
28/
33/
31/6
31/
28 to 36/

Change between 1865 and 1885.

Slight increase.
Contract work: 50 % decrease.
Slight increase.
Time work: 1865, 22/; '72, 24/; '80, 26/; '83, 24/; '85, 25/.
Time wages, 5 % above 1864.
Rise in 1872-73 of 15 %; 1885 same as 1865.
Advance of 3/.

Rise in 1872-73 of 7½ %; 1885 same as 1865.
Skilled work: 1865, 24/; '76, 27/; '78, 25/; '83, 28/; '85, 25/.
1865, 28/; 1885, 34/.
Increase of 5 %.

1865, 26/; 1885, 28/6.
None.
None.
1865, 28/6; 1869-73, 32/; 1885, 28/6.
1865-75, 25/6; 1875-85, 28/.
1864, 27/; '74, 34/; 1875-85, between 31/ and 37/.

1865, 26/; 1875, 31/.
Increase of 2/.
Increase of 30 %.


It is suggested that the following are the summary tables 
which should be inserted in a report dealing with the answers. 

The figures are given here for only one society, but the 
tabulations are framed so as to include all. 

TABLE I. State of Employment.

Name of      Total Number* in     Number Out    Percentage     Median of the
Society.     Branches making      of Work.      Out of Work.   Percentages Out of
             Returns on                                        Work in the
             Employment.                                       Various Branches.

A.S.E.       55,170               7,142         13             12
O.S.B.
&c.

* Details of some of the most important branches should be added.
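
The two percentages of Table I. can be checked against the branch answers. The sketch below is illustrative only: it assumes that Dundee's "45 per cent." means 630 men out of 1,400, that the Glossop membership reads 135, and that the three branches giving no number out of work are left out.

```python
from statistics import median

# (number in district, number out of work) for the twenty branches that
# made a return on employment. Dundee's 45 per cent. is converted to
# 630 men and Glossop is read as 135 members (both are assumptions).
returns = [
    (1100, 130), (2500, 230), (170, 20), (1400, 630), (28000, 4000),
    (1600, 250), (1200, 400), (135, 10), (280, 38), (114, 18),
    (4000, 600), (1600, 96), (630, 40), (900, 120), (201, 15),
    (1120, 43), (3200, 400), (6050, 2), (400, 30), (170, 70),
]

members = sum(n for n, _ in returns)
out_of_work = sum(k for _, k in returns)

# Percentage for the society as a whole (every member counts equally).
pooled_pct = 100 * out_of_work / members

# Median of the separate branch percentages (every branch counts equally).
branch_pcts = [100 * k / n for n, k in returns]
median_pct = median(branch_pcts)

print(out_of_work, round(pooled_pct), round(median_pct))
```

On these readings the number out of work is 7,142, the pooled percentage about 13, and the median of the branch percentages about 12, in agreement with the table. The membership total so obtained is 54,770 against the printed 55,170, so one of the readings is probably at fault.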
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TABLE II.— Current Wages. 



Name of      Average of Wages in Branches.     Quartiles of        Measure of
Society.     Unweighted.      Weighted.        Branch Wages.       Dispersion (v. p. 136).

             s.  d.           s.  d.           s.  d.    s.  d.
A.S.E.       30  0            29  7            28  0     32  0     A
O.S.B.
&c.


TABLE III.
A. Change of Wage between 1865 and 1885.

            Number of Branches showing              Median of     Percentages of Members in Branches showing
Name of     No                  No                  Percentage    No                  No
Society.    Answer.  Decrease.  Change.  Increase.  Increases.    Answer.  Decrease.  Change.  Increase.

A.S.E.      4        1          5        13         10            11       4          6        79
O.S.B.
&c.


Verbal Summary, — In the great majority of cases a con- 
siderable increase of wage took place between 1865 and 1885, 
equivalent on the whole to a rise of about 10 per cent. The 
figures are not sufficiently definite to give an exact average. 

Table III. B. Change of Wage between 1865 and the
Maximum about 1873.

Table III. C. Change of Wage between Maximum about
1873 and 1885.

(Tabulation as in III. A.)
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CHAPTER VII. 
THE GRAPHIC METHOD. 

I. General Purpose. 

The two main methods of elementary statistics which ought to 
be understood by all students or officials who handle figures, 
which are easily within the grasp of all independently of mathe- 
matical training, but are generally misunderstood or ignored by 
the uninterested or the uninitiated, are the method of averages 
and the method of diagrams or the graphic method. These two 
are placed together because the uses of averages and diagrams 
are nearly related. When we deal with large and complex 
masses of figures we are unable to grasp them in
their entirety, however clearly they may be tabulated.
Any list of figures (the populations of different towns,
the death-rates at successive ages, the wages of many
workpeople, the imports for a series of years) becomes less
comprehensible as its length increases. A series of ten numbers can,
perhaps, be easily grasped, of twenty only with an effort ; while 
a printed list of figures for one hundred successive years leaves 
hardly any impression on our mind at all ; we cannot see the 
wood for the trees. The test to which all questions as to the 
use of averages should be referred is that the averages selected 
should afford the best summary of the whole group in question 
that the mind can grasp. When the meaning of the word 
average was sufficiently extended, we found that we could select 
three, four, or even ten suitable figures which adequately showed 
the main features of any group. The main use of diagrams is
also to present large groups of figures so that they shall be
intelligible in their entirety, and the test for all diagrams is that
the diagram as drawn should afford the best view of the series
or group of figures that the eye can appreciate. Diagrams have
one use which averages have not, for it is only by a diagram that 
a series of figures relating to successive years can be adequately 
presented ; but in reality they are less essential than averages, for 
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the latter often have an existence independently of the figures 
from which they are derived, representing true types of the 
quantities which are being measured; and by their use alone 
are further comparisons of complex groups made possible : while 
diagrams, on the other hand, might be dispensed with, being
auxiliary rather than essential, merely an aid to the eye and 
a means of saving time. 

To connect this chapter more closely with the preceding, we
shall show how the same group of figures, for
example the wages of a large group of workpeople,
may be represented by either method.

Consider the following data : — 



Numbers of workpeople earning:

From 15/ to 16/        200
  „  16/ „  17/        400
  „  17/ „  18/        100
  „  18/ „  19/        100
  „  19/ „  20/        200       1,000

  „  20/ „  21/        200
  „  21/ „  22/        300
  „  22/ „  23/        300
  „  23/ „  24/        500
  „  24/ „  25/        900       2,200

  „  25/ „  26/      1,200
  „  26/ „  27/        800
  „  27/ „  28/        700
  „  28/ „  29/        500
  „  29/ „  30/        300       3,500

  „  30/ „  31/        300
  „  31/ „  32/        400
  „  32/ „  33/        400
  „  33/ „  34/        500
  „  34/ „  35/        500       2,100

  „  35/ „  36/        600
  „  36/ „  37/        400
  „  37/ „  38/        100
  „  38/ „  39/         80
  „  39/ „  40/         20       1,200


Using the method of averages we should replace this group by 
the following figures : — 



                               s.  d.
Average of all                 27  6
   „       lowest 1,000        17  0
   „       highest 1,000       36  6
   „       middle 4,000        27  0

or

Median, 26/9; quartiles, 24/2, 32/.

Deciles, 20/, 23/6, 24/9, 25/8, 26/9, 28/2, 31/, 33/4, 35/4.

Mode, 25/3; secondary positions, 16/6, 36/.

or

Persons earning from    15/ to 20/   20/ to 25/   25/ to 30/   30/ to 35/   35/ to 40/
Percentages of all          10           22           35           21           12

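
The average of all and the percentages in the five-shilling bands can be recomputed from the grouped figures. The following is a minimal sketch in Python, assuming that every class is taken at its middle point; the variable and function names are illustrative.

```python
# Lower limit of each one-shilling class and the number earning in it.
counts = {
    15: 200, 16: 400, 17: 100, 18: 100, 19: 200,
    20: 200, 21: 300, 22: 300, 23: 500, 24: 900,
    25: 1200, 26: 800, 27: 700, 28: 500, 29: 300,
    30: 300, 31: 400, 32: 400, 33: 500, 34: 500,
    35: 600, 36: 400, 37: 100, 38: 80, 39: 20,
}

total = sum(counts.values())

# Average of all, each class taken at its middle point (lower + 0.5).
mean_shillings = sum((lo + 0.5) * n for lo, n in counts.items()) / total

def to_s_d(shillings):
    """Convert decimal shillings to (shillings, pence), nearest penny."""
    pence = round(shillings * 12)
    return divmod(pence, 12)

# Percentage of all earners in each five-shilling band.
band_pct = {
    start: 100 * sum(counts[s] for s in range(start, start + 5)) / total
    for start in range(15, 40, 5)
}

print(to_s_d(mean_shillings))   # (27, 6)
print(band_pct)
```

The mean so found is 27s. 6d. to the nearest penny, and the bands hold 10, 22, 35, 21 and 12 per cent. of the whole.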
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This group is represented on the annexed diagram, an 

example of the graphic representation of the relation between
two variable quantities. A figure similar to this
may be used to show birth, marriage, or death
rates at different years, numbers of persons of
various statures, demand at different prices, or any such group 
of homogeneous quantities. The same construction can be 
used to show the changing values of any number in a series 
of years. Draw a line parallel to the bottom of the page, and 
mark equal intervals to represent a quantity which can have 
many successive small increments, such as age, income, height, 
price, time, and so on. This is called the axis of abscissae,
and the distance of a point measured from the zero position 
along the line is called its abscissa. At right angles to this 
line, parallel to the side of the paper, through the zero position 
we draw another, called the axis of ordinates, and grade this
to correspond to the numbers possessing the qualities repre- 
sented by the abscissae ; at each grade on the axis of abscissae, 
draw lines at right angles to it, to represent on the chosen scale 
the numbers at that grade ; these lines are called the ordinates. 
In the annexed diagram the abscissae represent the amounts 
of wages, the ordinates the number of persons earning them. 
Join the tops of the ordinates by straight lines and the diagram 
is complete. In practice, when squared paper is used, without 
drawing the ordinates their tops can be marked. 

This diagram shows at one glance the distribution of the 
wage-earners according to their wages. A small number earned
between 15s. and 16s., a slightly larger group
between 16s. and 17s., very few between 17s. and
19s. Above 19s. the number continually rises;
high numbers are found from 24s. to 27s., the highest between
25s. and 26s. The line falls to the 30s. group, but not so low
as between 17s. and 19s., then it rises regularly to 36s., and
falls rapidly to 39s. Here, then, we have the main group
congregated in the neighbourhood of 25s., a distinct but smaller
group at 36s., and a small and nearly isolated group at 16s.;
representing a considerable group of highly-skilled men between
30s. and 40s., the great mass with ordinary skill between 20s.
and 30s., and a small group of incompetents at 16s. These
features would not be so easily seen from the tabulated figures. 

It is to be noticed that the number tabulated as between 

15s. and 16s. is represented by the ordinate at 15s. 6d., the
middle of the interval ; if the original figures on which the 
table was based had been given to the nearest 1d., the ordinate
should be drawn at 15s. 5½d.* It is important that these middle
points should be accurately placed. 
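
The placing of the ordinates can be sketched as follows. This is an illustrative Python fragment, not a prescribed procedure: the function name, its parameters, and the use of exact fractions of a shilling are assumptions made for the example.

```python
from fractions import Fraction

def polygon_points(lower_limits, counts, width=1, rounding_unit=0):
    """Return (abscissa, ordinate) pairs for the line of the diagram.

    Each ordinate stands at the middle of its class. If the original
    figures were rounded to the nearest `rounding_unit`, the true class
    runs half a unit lower, so the middle point shifts down by that much.
    All quantities are in shillings.
    """
    shift = Fraction(rounding_unit) / 2
    return [
        (Fraction(lo) + Fraction(width) / 2 - shift, n)
        for lo, n in zip(lower_limits, counts)
    ]

lows = [15, 16, 17, 18, 19]
nums = [200, 400, 100, 100, 200]

exact = polygon_points(lows, nums)
penny = polygon_points(lows, nums, rounding_unit=Fraction(1, 12))

print(exact[0])   # the 15s.-16s. class stands at 15 1/2 shillings
print(penny[0])   # 15 11/24 shillings, that is 15s. 5 1/2d.
```

Joining successive points by straight lines gives the figure described above; the second call shows the half-penny displacement required when the returns are given to the nearest penny.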

The use of the line joining the tops of the ordinates is two- 
fold. First, it enables the eye to judge relative heights more 

easily ; and secondly, it suggests the idea of con- 
tinuity, which can be better illustrated by the next 
diagram. In this the abscissae represent ages, the ordinates 
the estimated numbers of persons living at and above the ages 
at which they stand per million inhabitants of England and 
Wales at the middle of the year 1891. The ordinates were 
drawn at the points on the axis of abscissae representing the 
middle of each year of age; but length of life cannot be 
expressed exactly in years, or even in months, days, or minutes. 
The intention of the diagram is to show the proportion living 
above each age, and for this purpose the joining line should 
have no breaks or sharp angles, but should suggest absolute 
continuity. 

In practice, it is useless to mark in the points for smaller 
intervals than a year, for the eye could not grasp the detail. 
It is, however, implied that the line drawn has the same shape 
as that which would result if the number of persons was infinite 
and the subdivision by age infinitesimal. 

Estimated number per 1,000 of the population at and above:

Ages.          Ages.          Ages.          Ages.          Ages.

  0   1,000     16    628      32    346      49    152      65     47
  1     973     17    607      33    332      50    143      66     43
  2     949     18    587      34    318      51    135      67     38
  3     925     19    567      35    305      52    127      68     34
  4     901     20    547      36    292      53    119      69     31
  5     877     21    528      37    280      54    112      70     27
  6     854     22    510      38    268      55    104      71     24
  7     830     23    491      39    256      56     98      72     21
  8     807     24    474      40    244      57     91      73     18
  9     783     25    456      41    233      58     85      74     15
 10     760     26    439      42    222      59     79      75      n
 11     738     27    423      43    211      60     73      76     11
 12     715     28    407      44    201      61     67      77      9
 13     693     29    391      45    191      62     62      78      8
 14     671     30    376      46    181      63     57      79      6
 15     649     31    361      47    171      64     52      80      5
                               48    161

Calculated from the Census of 1891.


* See p. 88, supra. 



Apply these remarks to the diagram facing p. 145. Average 
earnings for a year will not be reckoned exactly by shillings 
or even pence ; if we had a sufficient number of instances we 
should get regular sequences of earners at successive farthings, 
and the line representing them would have no sharp angles, 
but be continually curved. The figure rightly gives the eye 
this impression of continuousness. Similarly in the diagram 
representing exports facing p. 151, the line correctly gives the 
impression that exports are continuous day by day. 

By an obvious step we may suppose that the unit of area, that 
contained between vertical lines through two consecutive divisions 
on the axis of abscissa, and horizontal lines through 
two consecutive divisions on the axis of ordinates, 
represents one wage-earner, and it is then easy to see that the 
area contained between the base line, the curve, and two vertical 
lines through the points marking any two amounts of wage re- 
presents the total number earning rates between those amounts. 
Hence the lines (see p. 145) through M, the position of
the median, Q1, Q3, those of the quartiles, D1, D2, D3, D4, D6, D7,
D8, D9 of the deciles divide the area ABm1m2m3CD into two,
four, and ten equal areas respectively. The centre of gravity
of this figure lies on the vertical line through V, the average
wage; and n1, n2, n3, the feet of the ordinates through the
highest points m1, m2, m3, are at the modes.
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The details of technique of diagram drawing, the position 
of the scales, the devices for making the figure clear, and so
on, can be gathered from the various diagrams
given in this chapter. The degree of accuracy to
which the figures should be marked, whether correct to a 
million, a thousand, or a unit, is determined simply by the 
power of the eye to grasp detail ; in most of those here given 
it will be found that a displacement of one in a thousand is 
perceptible, and this is the ordinary limit. More minute accuracy 
is useless, for it is not the function of diagrams to dispense 
with lists of numbers, but only to enable the eye to perceive 
their significant features. 

Before discussing the choice of scales on which the numbers 
are to be represented, it is necessary to consider the ways in 

which a diagram makes an impression on the eye.
The eye can judge: (1) distances; (2) ratios; (3)
angles. The dotted lines in the diagram facing p. 151 will illustrate
these points. (1.) The eye is a fairly safe judge of distances;
there is very little doubt which of two points is the further
from the base line; when squared paper is used, a difference
of 1 in 1,000 is perceptible. The eye can also judge differences
quickly. In the figure the value of the exports in 1883 exceeded
that in 1885 by more than the value in 1890 exceeded that in
1883. (2.) It can be quickly seen that the value of exports
doubled between 1862 and 1889; or that the value in 1878 is
three-quarters of that in 1890. The accuracy with which the eye can
make such measurements is not great; it is not easy to detect
that the ratio of the values in 1873 and 1871 (1.095 : 1) is greater
than the ratio of the values in 1882 and 1880 (1.073 : 1); but the
general impression given by the diagram is partly made up by
unconscious calculations of this nature. To make these observations
accurately the method described on pp. 188-9 should be
used. Notice that for these observations the insertion of the
base line is necessary; and, because they are made unconsciously,
there are very few cases where a diagram without a base line
gives a correct impression. (3.) The question, Was the increment
greater in 1887-88 or in 1888-89? can be more quickly answered
by observing the angles than by noting the differences. The
line showing the latter change is steeper (makes a greater angle
with the horizontal) than the line showing the former. Hence
the latter increase is the greater; actually £14,400,000 against
£12,600,000. The most useful exercise of this power, however,
is to judge the dates at which the rate of increase changed ; thus 
the value of exports increased in 1862-63, increased at a slower 
rate in 1863-64, and slower yet in 1864-65, more rapidly in
1865-66; a slow fall followed in 1866-67, then an increase began 
which is continually accelerated to 1871, and so on. The line 
from 1872-76 is concave to the base line, showing an accelerated 
fall ; the concavity from 1879 to 1882 corresponds to a retarded 
rise ; at 1888 convexity gives place to concavity, for at that date 
the rate of increase began to diminish.
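
What the eye reads from the steepness of the line is the yearly increment. A short Python sketch, using annual values as read from the table of exports in this chapter (the entry for 1867 is taken as 181.0, and the function name is an assumption for illustration):

```python
# Annual export values (in millions of pounds) as read from the table
# of exports in this chapter; the 1867 entry is read as 181.0.
exports = {
    1862: 124.0, 1863: 146.5, 1864: 160.4, 1865: 165.8,
    1866: 188.9, 1867: 181.0,
    1887: 221.9, 1888: 234.5, 1889: 248.9,
}

def increment(year):
    """Change in value from `year` to the following year."""
    return round(exports[year + 1] - exports[year], 1)

# Successive increments 1862-63 ... 1866-67: the slope of each segment.
early = [increment(y) for y in range(1862, 1867)]
print(early)   # [22.5, 13.9, 5.4, 23.1, -7.9]

# The steeper of two segments is the one with the larger increment.
print(increment(1887), increment(1888))   # 12.6 14.4
```

The sequence of increments falls, rises and turns negative exactly as described above, and the two last figures are the £12,600,000 and £14,400,000 of the text.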

It is difficult to lay down rules for the proper choice of the 
scales by which the figure should be plotted out. It is only the
ratio between the horizontal and vertical scales
that need be considered. The figure must be
sufficiently small for the whole of it to be visible at once; if the
figure is complicated, relating to a long series of years and varying
numbers, minute accuracy must be sacrificed to this consideration.
Supposing the horizontal scale decided, the vertical
scale must be chosen so that the part of the line which shows
the greatest rate of increase is well inclined to the vertical,
which can be managed by making the scale sufficiently small;
and, on the other hand, all important fluctuations must be
clearly visible, for which the scale may need to be increased.
Any scale which satisfies both these conditions will fulfil its
purpose. The annexed page shows the erroneous impressions
which can be given by a judicious manipulation of the scale
and by the omission of the base line. The diagrams, which
are drawn roughly, all represent the same estimates of wages in
England and in the United States of America for certain years
from 1860. Figure 1 sets the lines in proper relief. In figure 2,
the base line is not drawn in the zero position
for the English scale, and the American scale is
reduced; the consequence is that English wages
appear to have fluctuated widely, while American made steady
progress. In figures 3, 4, and 5 the scales are doctored and the
base line adjusted, so that in 3 American wages seem to have
caught up English, in 5 exactly the reverse is the case, while in
4 wages appear to have moved with equal rapidity in both
countries. An examination of these figures will show that the
eye cannot be trusted to supply the right base line, or to
estimate the importance of fluctuations without it; and, with
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certain exceptions to be mentioned later,* it is well to distrust 
all those numerous diagrams, where space has been economised 
at the expense of the base line. 
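
The effect of a misplaced base line on the ratios which the eye reads off can be shown numerically. The figures below are hypothetical and the function is only a sketch of the principle.

```python
def apparent_ratio(a, b, base=0.0):
    """Ratio of the drawn heights of two values above the base line."""
    if min(a, b) <= base:
        raise ValueError("base line must lie below both values")
    return (b - base) / (a - base)

# Hypothetical wages of 20s. and 25s. in two years.
true_ratio = apparent_ratio(20, 25)           # base line at zero
cut_ratio = apparent_ratio(20, 25, base=18)   # base line raised to 18s.

print(true_ratio, cut_ratio)   # 1.25 3.5
```

A rise of one quarter is thus made to look like a rise of three and a half times when the base line is drawn at 18s. instead of at zero.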

Total Declared Real Value of British and Irish Produce
Exported from the United Kingdom, 1 = £1,000,000.

                        Averages.                                     Averages.
                 Three     Five      Ten                       Three     Five      Ten
                 Yearly.   Yearly.   Yearly.                   Yearly.   Yearly.   Yearly.

1855     95.7    ...       ...       ...       1878   192.8    197.4     210.9     218.0
1856    115.8    ...       ...       ...       1879   191.5    194.4     201.4     218.1
1857    122.0    111.2     ...       ...       1880   223.1    202.5     201.3     220.5
1858    116.6    118.1     ...       ...       1881   234.0    216.2     208.2     221.6
1859    130.4    123.0     116.1     ...       1882   241.5    232.9     216.7     220.1
1860    135.9    127.6     124.1     ...       1883   239.8    238.4     226.0     218.6
1861    125.1    130.5     126.0     ...       1884   233.0    238.1     234.3     217.9
1862    124.0    128.3     126.4     ...       1885   213.1    228.6     232.3     216.9
1863    146.5    131.9     132.4     ...       1886   212.7    219.6     228.0     218.1
1864    160.4    143.7     138.4     127.2     1887   221.9    215.6     224.1     220.4
1865    165.8    157.6     144.4     134.3     1888   234.5    223.0     223.0     224.5
1866    188.9    171.7     157.2     141.6     1889   248.9    235.1     226.2     230.2
1867    181.0    178.6     168.7     147.5     1890   263.5    249.0     236.3     234.2
1868    179.7    183.2     175.1     153.8     1891   247.2    253.2     243.2     235.5
1869    190.0    183.6     181.0     159.8     1892   227.1    245.9     244.2     234.1
1870    199.6    189.8     187.8     165.9     1893   218.1    230.8     240.9     231.9
1871    223.1    204.2     194.6     175.7     1894   215.8    220.3     234.3     230.2
1872    256.3    226.3     209.7     188.9     1895   225.9    219.9     226.8     231.4
1873    255.2    244.9     224.8     200.0     1896   240.1    227.3     225.4     234.1
1874    239.6    250.4     234.7     207.9     1897   234.3    233.4     226.8     235.4
1875    223.5    239.4     239.6     213.7     1898   233.4    235.9     229.8     235.3
1876    200.6    221.0     235.1     214.9     1899   255.4*   241.0     237.8     236.1
1877    198.9    207.7     223.7     216.7


* Not including the newly reckoned value of ships exported. 

We can now pass on to the consideration of the smoothing
of curves, for which purpose the question of the "alleged
stationariness of our exports," discussed by Sir R.
Giffen in his paper before the Royal Statistical
Society in 1899, affords an excellent illustration. The thin
dotted line on the diagram opposite shows the value of exports
year by year, and the first impression given by it is that exports
have not grown in value in recent years. Sir Robert Giffen
gave the following table:

Average Annual Value of Exports.

1855-57     £134,000,000

1865-67      228,000,000

1875-77      264,000,000

1885-87      274,000,000

1895-97      292,000,000



* See pp. 188-194, infra.
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and from this he deduced " that all through there is an increase, 
and that the only sign of stationariness is an increase at a less 
rate in the last periods than in the earlier periods." 

The Saturday Review* wrote "that such a conclusion is
grossly misleading," for the figures are merely triennial averages 
of selected years showing a happy coincidence ; " why was not 
1898 included?" An inspection of the numbers does not show 
us the answer to this criticism, but on the diagram the whole 
circumstances are visible at a glance. Since 1865 three great 
waves have been completed. The maximum of 1872, due to the 
inflated prices of that year, is very high, but that of 1890 is 
greater than any previous figure, while the maximum in 1882 is 
comparatively low. The minima increase throughout; those 
of 1868, 1879, 1886 show a regular progression, which falls off 
greatly in 1891. In 1894-96 it looked as if another decennial
cycle was in progress, but this has been checked. Since the
discussion the returns for 1899 show an increase which brings the
figure for that year very near the maximum of 1872. 

The Saturday Review went on to ask why Sir Robert Giffen 
did not give " proper quinquennial averages," such as — 

Average Annual Value of Exports. 

1870-74     £235,000,000

1880-84      234,000,000

1890-94      234,000,000

1898         233,000,000

and it must be granted that this gives an appearance dia- 
metrically opposite to that of the previous table. 

It is clear that we need some general method of bringing 
these figures into a form which shall be quite independent of the 
choice of any special years. The diagram facing page 151 does 
this. The thick continuous line, lying almost over the dotted line 
of annual values, shows triennial averages taken yearly, that 
is the average of each year with those before and after it ; this 
line smooths off the corners without affecting the general appear- 
ance. The line of crosses shows quinquennial averages, each 
year being averaged with the two previous and two subsequent 
years. The line of circles shows decennial averages; each circle 
is placed at the centre of the period whose average it represents ; 



* January 1899, pp. 66, 67. 
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thus the circle showing the average of the ten years 1875-84 is 
placed vertically over the line separating the years 1879 and 
1880.*
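
The averages taken yearly may be computed as below. This is a sketch only: the function name is an assumption, the annual values are those read from the table of exports (1867 taken as 181.0), and each average is attached to the centre of the period it covers, as on the diagram.

```python
def moving_averages(values, first_year, span):
    """Average every run of `span` consecutive years.

    Returns (centre, average) pairs, where the centre is the middle of
    the period averaged, so a ten-year average falls between two years.
    """
    out = []
    for i in range(len(values) - span + 1):
        window = values[i:i + span]
        centre = first_year + i + (span - 1) / 2
        out.append((centre, round(sum(window) / span, 1)))
    return out

# Annual values for 1855-1868, in millions of pounds, as read from the
# table of exports (1867 taken as 181.0).
annual = [95.7, 115.8, 122.0, 116.6, 130.4, 135.9, 125.1,
          124.0, 146.5, 160.4, 165.8, 188.9, 181.0, 179.7]

triennial = moving_averages(annual, 1855, 3)
decennial = moving_averages(annual, 1855, 10)

print(triennial[0])    # 1855-57, centred on 1856
print(decennial[-1])   # 1859-68, centred between 1863 and 1864
```

The first triennial average comes to 111.2 and the decennial average for 1859-68 to 153.8, as in the table, where each average appears to be entered against the last year of its period.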

On looking at the line of quinquennial averages it is clear 
that the Saturday Review did precisely what it accused Sir
Robert Giffen of doing, for years are taken which
favour the argument. The quinquennial periods
of the waves, the marks showing these averages are very near 
the maxima of the quinquennial line, while the year 1898 does 
not appear to be a maximum. We might with just as much or 
as little accuracy give the following : — 

Quinquennial Averages of the Values of Exports. 

1865-69     £181,000,000

1875-79 201,000,000 

1885-89 226,000,000 

1898 233,000,000 

and say that the value in 1898 was higher than any of the pre- 
vious selected averages. There is no need to use arbitrary dates 
to get at the facts. No argument can stand which does not take 
account of the cycle of trade, which is not eliminated till we 
take decennial averages. Special marks in the diagram show 
the averages for 1859-68, 1869-78, 1879-88, 1889-98, and indicate 
a rapid increase before 1870, and a steady slower progress since. 
The complete line gives just the same general appearance. If, 
finally, the figures were completely smoothed by a freehand line 
keeping as close to this as was possible, without making sudden 
changes of curvature, the same appearance would be given ; the 
thick line on the diagram is an attempt to do this. The smooth- 
ing is obtained by the assumption that the cycle of trade is ten 
years ; when two maxima fall within the same ten years the 
average of this period by our construction gives the appearance 
of a maximum (e.g., in 1887) at a date of a minimum. This
would be avoided if we continually changed our period for 
averaging to accommodate the changing wave-length, a some- 
what arbitrary proceeding. The difficulty thus arising can be 
easily corrected by the eye, and the final smoothed line is 
intended to convey this corrected impression. 

* In all the curves of averages the mark showing the average is placed at 
the centre of gravity of the marks showing the 3, 5, or 10 quantities averaged. 
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It should be clear now that it was in 1899 five years too 
soon to pay attention to the particular figure for 1898; the
figures for the next five years, necessary to determine the character
of the coming wave, could not be foretold. When the figure
for 1899 (not represented on the diagram) is included, the new
decennial average (1890-99) is the highest on record, while the
actual value for the year 1899 has only been exceeded in 1890 
and 1873. It will be seen, moreover, that the sentence quoted 
from Sir Robert Giffen on p. 152 is fully justified. 

The smoothed line now constructed represents the general 
tendency of the value of exports, when accidental and temporary
variations are removed. If it were possible
to separate entirely variations of short period from
secular changes, to separate the ebb and flow of the tide of 
commerce from the steady current of increasing trade, we may 
suppose that we should obtain a result represented by this line. 
In it there are no sudden changes even in rates of growth, while 
the addition and subtraction year by year of relatively small 
quantities would produce precisely that irregular fluctuating line 
from which the smooth line was obtained. 

The fuller discussion of "smoothing" series of figures belongs
to the chapter on interpolation, but one other group may
be considered, as showing the use of the
graphic method for obtaining regularity out of
irregular raw material. Referring back to the
figures given on p. 120, the wages of 5,000 workers can be
expressed anew by a diagram, in which the ordinates represent
the numbers earning at or above a certain wage. The thin
angular line on the adjacent page represents these numbers,
entered for every 10-cent group. This plan is especially useful
for irregular figures, like this wage-group, for the line must 
always tend upwards from the numbers earning the highest 
wage to the numbers earning at least the lowest. The diagram 
is also at once adaptable to the graphic method of finding the 
median described on p. 127. 

The irregularities shown by the thin line do not arise from 
any law of wage-grouping, but are due to the accidents of observation;
if we regard these returns as samples out of a much
larger unregistered group, we may suppose that a smoothed 
curve will indicate approximately the form which would be 
obtained, if our returns were complete. To smooth this figure, 
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draw a freehand line passing as near the points as possible 

without abrupt changes of curvature, as in the annexed diagram. 

A new approximation may be made for the median, quartiles,
&c., by drawing horizontal lines through the points
on the vertical scale corresponding to half, one-quarter,
three-quarters, &c., of the workers; from
the points where these cross the smooth line, draw vertical lines
to the scale of dollars; the points on the scale so obtained are
the median (quartile, &c.) wage.
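
The construction can be imitated arithmetically when the line is taken as straight within each class. The sketch below is an illustration under stated assumptions: it uses the shilling wage-group given earlier in this chapter (not the dollar figures of p. 120, which are not reproduced here), and the function names are invented for the example.

```python
def at_or_above(lower_limits, counts):
    """Numbers earning at or above each class limit."""
    running, out = 0, {}
    for lo, n in sorted(zip(lower_limits, counts), reverse=True):
        running += n
        out[lo] = running
    return out

def wage_where_line_reads(target, lower_limits, counts, width=1):
    """Wage at which the 'at or above' line stands at `target` workers,
    the line being taken as straight within each class."""
    above = at_or_above(lower_limits, counts)
    for lo, n in sorted(zip(lower_limits, counts)):
        remaining = above[lo] - n      # number at or above lo + width
        if n > 0 and remaining <= target <= above[lo]:
            return lo + width * (above[lo] - target) / n
    raise ValueError("target outside the range of the line")

# Shilling wage-group of this chapter: classes 15/-16/ up to 39/-40/.
lows = list(range(15, 40))
nums = [200, 400, 100, 100, 200, 200, 300, 300, 500, 900,
        1200, 800, 700, 500, 300, 300, 400, 400, 500, 500,
        600, 400, 100, 80, 20]

total = sum(nums)
median_wage = wage_where_line_reads(total / 2, lows, nums)
lower_quartile = wage_where_line_reads(3 * total / 4, lows, nums)

print(median_wage, round(lower_quartile, 2))   # 26.75 24.22
```

These are the 26/9 and (nearly) 24/2 quoted for that group; with a freehand smooth line in place of the angular one the readings would differ slightly.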

The results obtained are:

                                         Median.    Quartile.   Quartile.

Given on p. 92                           $1.49      ...         ...
By method of p. 128, used in
  annexed diagram                        $1.49      $1.16       $2.12
From smooth curve in annexed diagram     $1.51      $1.15       $2.13
By method of interpolation, p. 253       $1.536     ...         ...


This method is not, however, one of great precision ; a very slight 
change in the curvature of the smoothed line would make more 
difference than those shown between the second and third lines 
in the above table. 
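The construction described above can be imitated numerically. The following is only a sketch: the wage groups are hypothetical round numbers (they are not the figures of p. 120), and the straight-line interpolation stands for the thin angular line, not for the freehand smooth curve.

```python
# Hypothetical wage groups: (lower limit in dollars, number of workers).
# These are illustrative figures, NOT those of the table on p. 120.
groups = [(0.50, 400), (1.00, 1600), (1.50, 1500),
          (2.00, 900), (2.50, 400), (3.00, 200)]
width = 0.50  # assumed uniform width of each group, in dollars


def at_or_above(groups, width):
    """Points (wage, number earning at or above that wage), highest wage first."""
    points = [(groups[-1][0] + width, 0)]
    running = 0
    for lower, count in reversed(groups):
        running += count
        points.append((lower, running))
    return points


def wage_at(points, target):
    """Wage at which the angular line reaches `target` workers.

    Equivalent to drawing a horizontal line at `target` and dropping a
    vertical from the point where it meets the angular line.
    """
    for (w_hi, n_hi), (w_lo, n_lo) in zip(points, points[1:]):
        if n_hi <= target <= n_lo:
            if n_lo == n_hi:
                return w_lo
            fraction = (target - n_hi) / (n_lo - n_hi)
            return w_hi - fraction * (w_hi - w_lo)
    raise ValueError("target lies outside the range of the returns")


points = at_or_above(groups, width)
total = points[-1][1]
median = wage_at(points, total / 2)
upper_quartile = wage_at(points, total / 4)      # one quarter earn this or more
lower_quartile = wage_at(points, 3 * total / 4)  # three quarters earn this or more
print(round(median, 2), round(lower_quartile, 2), round(upper_quartile, 2))
```

A smoothed curve would shift these values slightly, which is the want of precision noted in the text.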

Graphic method of finding the mode.

This method is more useful for determining the mode. It
will be remembered that the difficulties in doing this before
arose from the uneven distribution on the two sides of the
mode, and in the displacement of the mode by the adoption of
a second system of tabulation. The first of these difficulties
entirely disappears in the graphic method, while the second is
diminished, for the displacement now only depends on the
slight possible variations in the curvature of the smooth line.
The mode is clearly the position where the greatest number is
added, in the present method of representing the figures : that
is, the mode is where the line, angular or smooth, is steepest.
On the smooth curve the maximum steepness is where the
tangent crosses the curve, in mathematical language, at a point
of inflexion. This can be determined mechanically by placing a
ruler to touch the curve, and turning it round the curve till it
crosses it. On the annexed figure this occurs in the interval
between $1.10 and $1.40. A more complex method of
determining both mode and median is discussed in Chap. X.
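The statement that the mode lies where the line is steepest may be illustrated on the angular line. The sketch below uses hypothetical points (not the figures of the annexed diagram) and simply picks the interval in which most workers are added per dollar of wage; on a smooth curve the same idea leads to the point of inflexion.

```python
# Hypothetical points on the "at or above" line:
# (wage in dollars, number of workers earning at least that wage).
points = [(0.50, 5000), (1.00, 4600), (1.50, 3000), (2.00, 1500),
          (2.50, 600), (3.00, 200), (3.50, 0)]


def steepest_interval(points):
    """Return (slope, lower wage, upper wage) of the steepest segment.

    The slope is the number of workers added per dollar of wage, so the
    rule works equally when the wage intervals are of unequal width.
    """
    best = None
    for (w0, n0), (w1, n1) in zip(points, points[1:]):
        slope = (n0 - n1) / (w1 - w0)
        if best is None or slope > best[0]:
            best = (slope, w0, w1)
    return best


slope, low, high = steepest_interval(points)
print(f"mode lies between ${low:.2f} and ${high:.2f}")
```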

This graphic way of finding these averages has two great 
advantages. It can be applied to numbers which are given 
at irregular intervals of graduation (e.g., 30 at 30s. 6d., 40 at 
30s. 8½d., 35 at 40s. 1d., &c.) as easily and by exactly the same 
construction as to more regular returns ; and if the smooth curve 
is carefully drawn, the number of modes can be seen at a glance 
and the individual importance of each can be estimated. In 
the annexed diagram, the curve is concave to the base line from 
$.30 to about $1.20, convex from about $1.20 to $3.15, concave till 
$3.40, and then convex till the end. The points of inflexion or 
the modes are where concavity gives way to convexity. Hence 
there are two modes, of which that near $3.40 is of the less 
importance. The mathematical method of pp. 252-4 shows them 
to be at $1.10 and $3.20. 



Pictorial diagrams.

A large class of diagrams may be passed by with a few
words. Writers and lecturers frequently use points, lines,
triangles, squares, circles, even pictures, of different sizes to
assist the presentation of the relative magnitude of numbers.
These have their use for popular lectures and hand-books, but
do not add anything to the significance of the figures.
Collections of these may be found in the second volume of
Gabaglio's Teoria Generale della Statistica, and in M. Levasseur's
La Statistique Graphique in the Jubilee Volume of the Royal
Statistical Society.

Of these one group may be signalled as of practical use. 
Rectangles may be used to express three quantities : one side 
to represent price ; the adjacent side, quantity ; and the area, 
value : or number of houses, average number of inmates and 
population : or number of hours' work per week, average output 
or hourly wage, and total output or weekly wage. The figures 
on the annexed page show the limit to which this method can 
be usefully pushed. 

Representation of Three Facts by Rectangles.
Imaginary budgets of an artisan and a labourer, showing amounts
spent weekly on various commodities, and number of hours' work
necessary for each amount.

The horizontal scale represents pence per hour.
.125 inch = 1d.

The vertical scale represents number of hours per week.
.1 inch = 2 hours.

The areas represent amounts spent, and the whole rectangles
show the week's wages on the same scale. 1 sq. in. = 13s. 4d.
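The three scales of the figure are consistent with one another, as a short calculation shows: an inch of width stands for 8d. per hour, an inch of height for 20 hours, and therefore a square inch for 160d., or 13s. 4d. The sketch below checks this; the artisan's rate and hours in the example are hypothetical and are not taken from the figure.

```python
from fractions import Fraction

# Scales stated under the figure.
INCH_PER_PENNY_PER_HOUR = Fraction(1, 8)   # .125 inch = 1d. per hour
INCH_PER_HOUR_OF_WORK = Fraction(1, 20)    # .1 inch = 2 hours


def rectangle(pence_per_hour, hours):
    """Width, height (inches) and area (square inches) of a budget rectangle."""
    width = pence_per_hour * INCH_PER_PENNY_PER_HOUR
    height = hours * INCH_PER_HOUR_OF_WORK
    return width, height, width * height


# Money represented by one square inch, in pence, then in shillings and pence.
pence_per_square_inch = 1 / (INCH_PER_PENNY_PER_HOUR * INCH_PER_HOUR_OF_WORK)
shillings, pence = divmod(pence_per_square_inch, 12)
print(f"1 sq. in. = {shillings}s. {pence}d.")

# Hypothetical example: an artisan at 8d. per hour working 50 hours.
width, height, area = rectangle(8, 50)
weekly_wage_in_pence = area * pence_per_square_inch
```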



The use of statistical maps needs only a brief notice. Any
numerical quality of a population, its density, average income,
average taxation, may be shown district by district by suitable
markings, or colours. Of these the most useful method is to
choose one colour, say blue, for excess above the average ;
another, say red, for defect. Divide the districts into nine
groups, say more than 7 per cent., 5 to 7 per cent., 3 to 5 per
cent., 1 to 3 per cent., above the average : these should be
marked by four shades of blue, becoming lighter as the average
is approached ; within 1 per cent. of the average, above or
below, should be white ; and shades of red, gradually becoming
darker, will show the remaining grades below the average. Care
must be taken not to adopt too many grades. For examples of
this method see Booth's Life and Labour of the People, maps ;
the Statistical Atlas of the XIth Census of the United States ;
the Statistical Atlas of India ; and the maps in M. Levasseur's
paper just mentioned. A cheap and very effective method, by
which similar results are obtained in black and white only, may
be seen on Plate P (misprinted 2) in that paper, and in the
excellent chapter on Graphic Representation in Bertillon's
Cours Élémentaire de Statistique, p. 133 seq.
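The nine grades suggested above may be written out as a rule for assigning a marking to each district. This is a sketch only: the treatment of a district falling exactly on a limit (1, 3, 5 or 7 per cent.) is an assumption, since the text does not settle it, and grade 1 is taken as the lightest shade.

```python
def shade(value, average):
    """Marking for a district, from its percentage excess or defect.

    Returns "white" within 1 per cent. of the average, otherwise the colour
    ("blue" above, "red" below) with a grade from 1 (lightest, nearest the
    average) to 4 (darkest, more than 7 per cent. away).
    """
    excess = 100.0 * (value - average) / average
    size = abs(excess)
    if size <= 1:
        return "white"
    colour = "blue" if excess > 0 else "red"
    if size <= 3:
        grade = 1
    elif size <= 5:
        grade = 2
    elif size <= 7:
        grade = 3
    else:
        grade = 4
    return f"{colour} {grade}"


# Hypothetical districts compared with an average of 100.
for district_value in (100.5, 102, 96, 110):
    print(district_value, shade(district_value, 100))
```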
2. Historical Diagrams. 

Perhaps the chief use of diagrams is to afford a rapid view 
of the relations between two series of events. 

The different cases that occur are best illustrated by examples. 
The simplest is when we wish to compare two sets of figures
expressed in the same unit, say £ sterling ; and the simplest of
these when we wish simply to compare a whole and its parts.

Illustrated by the revenue.

On the adjacent diagram the upper line shows the annual
total gross revenue (Statistical Abstract, p. 9) ; the next line,
that part which comes from inland revenue and customs, the
difference being mainly composed of post office and telegraph
receipts. The principal heads of revenue are
customs, excise, income tax, and post office. These are shown 
by suitable lines for each year, each line being independent of 
the other, and all having the same base line and being on the 
same scale. This method is greatly preferable to the alternative 
one of drawing a second line representing the total less customs, 
a third the total less customs and excise, and so on, because the 
eye is then quite incapable of judging the relative movements of 
the separate items. The figure shows at once the main features of 
the course of revenue. The increase has been rapid but irregular. 
The growth in the Crimean War was too rapid to be at once 
maintained, but the figures for the 60's are at a far higher level 
than those for the 50's. A rapid fluctuation in 1870 is followed 
by a more regular growth almost unchecked till 1887 ; and then, 
after a short stationary period, there is a great increase in 1895. 
These remarks apply almost without alteration to the line show- 
ing inland revenue and customs. If we look for the parts of the 
revenue that have borne the increase and change, we see that in 
the whole period receipts from excise have increased most, next 
those from the income tax, and next those from the post office, 
while the customs have diminished. Each line has its distinc- 
tive features. The post office payments show an almost regular 
growth. The income tax fluctuates violently, bearing the brunt 
of nearly all the rapid changes in the total, especially in 1856
and 1870. The fall in this line in 1872-76 is counterbalanced by
the rise in excise ; while the excise line shows stationariness till
1870, a sudden jump to 1874, and a very slow decline since that
date. Customs, on the other hand, have to some extent taken an
opposite course to that of excise, so that the total from the two
has not changed very rapidly. There is a very marked
stationariness since 1871. At the top of the page a new base line
is taken, and the number of pounds per head of the population is
shown year by year ; it will be seen that the only important
increase was between 1851 and 1857, and that since 1860 the
fluctuations have been slight.

Revenue of the United Kingdom.

Unit, in all columns, £10,000.

Year.     Total     Inland   Customs.    Excise.   Property   Post and
        Revenue.   Revenue                       and Income  Telegraph.
                     and                             Tax.
                  Customs.

1850      5,739      5,431      2,226      1,497       560*       216
1851      5,732      5,412      2,204      1,528       560*       228
1852      5,658      5,335      2,222      1,538       550*       237
1853      5,753      5,401      2,214      1,575       570*       237
1854      5,890      5,502      2,251      1,630       580*       252
1855      6,282      5,944      2,163      1,680*    1,070*       237
1856      7,026      6,601      2,324      1,730*    1,520*       281
1857      7,279      6,848      2,353      1,840*    1,620*       292
1858      6,788      6,309      2,311      1,782     1,159        292
1859      6,548      5,987      2,412      1,790       668        320
1860      7,109      6,570      2,446      2,036       960        331
1861      7,028      6,514      2,331      1,943     1,092        340
1862      6,986      6,412      2,367      1,833     1,036        351
1863      7,060      6,390      2,403      1,715     1,057        3?5
1864      7,021      6,306      2,323      1,821       908        381
1865      7,031      6,291      2,257      1,956       796        410
1866      6,781      6,036      2,128      1,979       639        425
1867      6,943      6,156      2,230      2,067       570        447
1868      6,960      6,204      2,265      2,016       618        463
1869      7,259      6,422      2,242      2,046       862        466
1870      7,543      6,708      2,153      2,176     1,004        477
1871      6,994      6,106      2,019      2,279         ?        527
1872      7,471      6,484      2,033      2,333         ?        543
1873      7,661      6,660      2,103      2,578       750        583
1874      7,734      6,608      2,034      2,717       569        700
1875      7,492      6,397      1,929      2,739       431        679
1876      7,713      6,525      2,002      2,763       411        719
1877      7,857      6,636      1,992      2,774       5??        730
1878      7,774      6,610      1,997      2,746       582        746
1879      8,115      6,899      2,032      2,740       871        757
1880      7,934      6,695      1,933      2,530       923        777
1881      8,187      6,895      1,918      2,530     1,065        830
1882      8,396      7,058      1,929      2,724       994        863
1883      8,739      7,313      1,966      2,693     1,190        901
1884      8,616      7,187      1,970      2,695     1,072        947
1885      8,799      7,380      2,032      2,660     1,200        966
1886      8,958      7,493      1,983      2,546     1,516        989
1887      9,077      7,611      2,015      2,525     1,590      1,028
1888      8,980      7,566      1,963      2,562     1,444      1,060
1889      8,847      7,360      2,007      2,560     1,270      1,118
1890      8,930      7,341      2,042      2,416     1,277      1,177
1891      8,949      7,358      1,948      2,479     1,325      1,226
1892      9,099      7,534      1,974      2,561     1,381          ?
1893      9,040      7,480      1,971      2,536     1,347      1,288
1894      9,113      7,543      1,971      2,520     1,520      1,301
1895      9,468      7,865      2,011      2,605     1,560      1,334
1896     10,197      8,512      2,076      2,680     1,610      1,422
1897     10,395      8,597      2,125      2,746     1,665      1,477
1898     10,661      8,855      2,180      2,830     1,725      1,518
1899     10,834      8,945      2,085      2,920     1,800      1,586

* These figures cannot be given accurately within £100,000.

Choice of second scale.

So far we have found no more difficulty in the choice of
scales than previously when dealing with only one line, for all
the lines on the larger diagram indicate millions of pounds, and
when the unit is £1, a new base line has been adopted. But we
may need to show the change
of population on the larger diagram. It is necessary, as 
we have already seen, to use the same base line for the two 
quantities to be compared ; but we may choose any point for 
the beginning of the new line, adapting our vertical scale, for the 
eye can judge the proportionate changes wherever the line is 
placed. It is best to decide this point by defining the problem 
on which the comparison should throw light. If it is required to 
compare the growth of revenue with the growth of population 
since, say, 1850, we should start the new line at the point on 
the 1850 line where the revenue curve begins, and we can then 
see how the lines intersect one another again and again. Since 
1850, however, is an arbitrary date, this plan lacks definition, 
and it is more logical to make the lines coincide at the most 
recent date given, with which any previous date can then be 
compared. The plan adopted on the diagram given is another 
alternative ; the line is drawn on such a scale that it lies fairly 
close to that for inland revenue throughout the greater part of 
its course. 
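The alternatives just described differ only in the year at which the two lines are made to coincide. The sketch below takes the total revenue for 1850 and 1899 from the table (unit, £10,000) ; the population figures are hypothetical round numbers inserted for illustration and are not taken from the text.

```python
def scale_to_coincide(series, reference, year):
    """Rescale `series` so that its line meets the `reference` line in `year`."""
    factor = reference[year] / series[year]
    return {y: value * factor for y, value in series.items()}


# Total revenue from the table, unit £10,000.
revenue = {1850: 5739, 1899: 10834}
# Hypothetical population in millions (illustrative only).
population = {1850: 27.4, 1899: 40.8}

# Lines starting together in 1850: shows how far revenue has outrun population.
start_together = scale_to_coincide(population, revenue, 1850)
# Lines ending together in 1899: earlier dates are compared with the latest.
end_together = scale_to_coincide(population, revenue, 1899)

print(round(start_together[1899]), round(end_together[1850]))
```

Either scaling leaves the proportionate changes of the population line unaltered ; only its position on the diagram, and hence the comparison suggested to the eye, is changed.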

Comparison of quantity and value.

The next diagram, facing p. 164, introduces further
difficulties as to the choice of scales. The object of the figure is
to show the relations between quantity, value, and price of
imported wheat, and population. The line A is first drawn on a
scale chosen so as to throw its fluctuations into relief.
Population is at once brought into relation with this by
calculating the amount per head year by year. The line C to
represent these figures is drawn on a different scale, chosen so
that the line shall not cause confusion by continually crossing
any of the others on the figure. If the figure was too full this
could be treated as on p. 161, the revenue per head.

Details of construction.

The same scale of years must be used, and for simplicity of
calculation and appearance, 100 lbs. consumed per head is
measured by the same vertical distance as 10,000,000 cwt.
imported. A and C refer to the same quantities, and therefore
similar lines are used in both cases. The line B represents value
and is shown by a broken line. For this line the choice of scale
is more difficult. In the diagrams which follow, instances will be
shown where special methods are used to bring out specific
comparisons. Here this is not necessary, and a scale is adopted
which brings the lines A and B into near relation, and shows the
fluctuations of B, while the figure is made simple and
intelligible by the representation of £20 by the same vertical
distance as 20 cwt.

The line D shows the changing price of wheat. The scale is 
chosen so that it boldly crosses the lines A and B ; thus its 
fluctuations are clearly shown, and the numbers are easily seen, 
for 2s. per cwt. is represented by the same vertical line as 
10,000,000 cwt. If the figure was accurately drawn, lines A and 
D would lie one over the other in 1876-77 ; they are therefore 
shifted very slightly horizontally, and clearness is preserved 
without the general impression being vitiated. 

Historical facts illustrated.

This line B shows some very interesting facts. Its chief
characteristic is excessive fluctuation ; while a smoothed line
would show an upward tendency till 1878 and a fall since that
date. The fluctuations are the result
of a great number of causes : an increasing population, the 
fact that wheat imported is only complementary to the home 
product, which is dominated by the English weather, the varia- 
tion of harvests all over the world, political events, the fall in the 
value of silver, the development of means of communication and 
transport, and all the other causes which determine price. Notice 
how all these are indicated by this single line. The upward 
tendency till 1875 shows an increasing population ; a deficient 
home harvest is shown by the rise in 1871-73, a world-wide defi- 
ciency by the fall in 1880, a good home product by the fall in 
1875. The American Civil War is marked in 1865, the general 
improvement in transport by the rise before 1875, the fall of 
prices by the fall since 1878. These various causes, however, 
often tend to neutralise one another. 

Importations of Wheat and Wheat Flour, 1862 to 1898.

              A.              B.              C.               D.
Year.   Total Quantities  Total Value   Quantity retained  Average Price of
           Imported.       Imported.    per Head of the    Wheat and Wheat
             Unit,           Unit,        Population.         Flour in
         100,000 cwt.      £100,000.         lbs.         Shillings per cwt.

1862          500             286            185               11.44
1863          3?2             155            112               10.03
1864          288             135            104                9.37
1865          258             124             93                9.61
1866          294             168            104               11.43
1867          391             285            140               14.58
1868          365             249            130               13.64
1869          444             233            156               10.50
1870          369             196            123               10.62
1871          444             268            151               12.07
1872          476             303            163               12.73
1873          516             344            171               13.33
1874          493             309            162               12.53
1875          595             324            197               10.89
1876          519             279            168               10.75
1877          635             407            202               12.82
1878          597             342            187               11.46
1879          730             400            228               10.95
1880          685             393            209               11.47
1881          713             407            217               11.42
1882          808             449            242               11.11
1883          851             438            252               10.30
1884          669             301            192                9.00
1885          823             337            238                8.19
1886          670             261            188                7.79
1887          802             314            224                7.82
1888          804             315            223                7.82
1889          789             311            219                7.88
1890          824             327            226                7.94
1891          895             396            244                8.85
1892          956             371            245                7.76
1893          938             308            248                6.57
1894          967             268            256                5.54
1895        1,073             302            285                5.63
1896          996             309            257                6.21
1897          887             330            228                7.44
1898          944             377            238                7.99
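The columns of this table are connected: since A is in units of 100,000 cwt. and B in units of £100,000, the quotient B/A is the value in pounds per cwt., and twenty times that is the price in shillings, which appears to be how column D arises. The sketch below checks this on four rows taken from the table ; that D was computed in exactly this way is an inference from the figures, not a statement of the text.

```python
# Rows from the table: year -> (A quantity, B value, D price in shillings).
rows = {
    1862: (500, 286, 11.44),
    1867: (391, 285, 14.58),
    1884: (669, 301, 9.00),
    1898: (944, 377, 7.99),
}


def price_in_shillings(quantity, value):
    """Average price per cwt. in shillings.

    quantity is in units of 100,000 cwt. and value in units of £100,000,
    so the units cancel and value / quantity is pounds per cwt.
    """
    return 20 * value / quantity


for year, (quantity, value, printed_price) in rows.items():
    computed = price_in_shillings(quantity, value)
    print(year, round(computed, 2), printed_price)
```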



Choice of markings.

As regards the choice of markings for different lines, the
chief rule is that lines which cross one another, unless very
acutely, must be marked differently. The second rule is to mark
similar quantities in similar ways. Thus in the next diagram the
lines representing quantities have a resemblance to one another,
as have also those showing values ; while the two lines relating
to imports are distinguished from those relating to exports. If
it is possible to use more than one colour this principle can be
easily carried out.*

The Cotton Trade, 1854-98.

          Piece Goods Exported.      Raw Cotton Imported.     Price per
Year.     Quantity.      Value.      Quantity.      Value.       Cwt.
          000,000's      000's        000's         000's
          omitted.      omitted.     omitted.      omitted.
           Yards.          £           cwts.          £           £

1854        1,693        25,055        7,923        20,175       2.55
1855        1,938        27,579        7,962        20,849       2.62
1856        2,035        30,204        9,142        26,448       2.89
1857        1,979        30,323        8,655        29,289       3.38
1858        2,324        33,422        9,235        30,107       3.26
1859        2,563        38,744       10,946        34,560       3.16
1860        2,776        42,142       12,419        35,757       2.88
1861        2,563        37,580       11,223        38,653       3.44
1862        1,681        28,562        4,678        31,093       6.65
1863        1,711        37,634        5,983        56,282       9.41
1864        1,752        43,917        7,983        78,219       9.90
1865        2,014        44,876        8,737        66,041       7.56
1866        2,576        57,903       12,299        77,530       6.30
1867        2,832        53,128       11,276        52,003       4.61
1868        2,977        50,265       11,864        55,194       4.65
1869        2,869        49,922       11,907        56,847       4.77
1870        3,267        53,348       11,95?        53,478       4.47
1871        3,417        53,643       15,876        55,907       3.52
1872        3,538        58,931       12,579        53,381       4.24
1873        3,484        56,493       13,639        54,705       4.01
1874        3,607        55,023       13,990        50,696       3.62
1875        3,562        53,627       13,325        46,260       3.46
1876        3,669        50,378       13,284        40,181       3.03
1877        3,838        52,442       12,101        35,421       2.93
1878        3,619        48,104       11,968        33,520       2.80
1879        3,725        46,875       13,119        36,181       2.76
1880        4,496        57,678       14,542        42,772       2.94
1881        4,777        59,104       14,992        43,835       2.92
1882        4,349        55,443       15,930        46,655       2.93
1883        4,539        55,534       15,485        45,042       2.91
1884        4,417        51,666       15,618        44,486       2.85
1885        4,375        48,277       12,731        36,473       2.86
1886        4,850        50,172       15,313        38,128       2.49
1887        4,904        51,742       15,995        40,156       2.51
1888        5,038        52,582       15,462        40,009       2.59
1889        5,001        51,388       17,299        45,642       2.64
1890        5,125        54,160       16,013        42,757       2.67
1891        4,912        52,432       17,811        46,081       2.59
1892        4,873        48,766       15,850        37,888       2.39
1893        4,652        47,282       12,650        30,685       2.43
1894        5,312        50,219       15,965        32,944       2.06
1895        5,033        46,759       15,688        30,429       1.94
1896        5,218        51,196       15,669        36,272       2.31
1897        4,792        45,808       15,394        32,195       2.09
1898        5,216        47,910       19,005        34,126       1.80

The history of the cotton trade.

This diagram is intended to show the relations between the
quantities and values of cotton imported and exported during
forty years. The vertical scale for values is chosen so as to
bring the whole figure to a convenient size and to mark the
fluctuations. The value of the raw cotton imported is increased,
perhaps trebled by manufacture, and of the finished product a
large part is used at home, the rest exported. The excess of the
value of exports over imports therefore represents the increment
of value due to manufacture (that is the total earnings or the
wages and profits of the cotton industry), less the total value of
all cotton goods sold at home. When the exports are less in
value than the imports, the earnings of manufacture are less
than the home consumption : when equal, equal. Looking at the
diagram, it will be seen that value of exports of piece goods
exceeded that of general imports from 1854 to 1860, though
often by only a small margin ; that the reverse was the case
during the cotton famine in 1861-66, when extravagant prices
were paid for raw cotton to partially supply the home market at
a high price, while the export fell off. Equality was again
attained in 1867, while since 1871 exports have greatly exceeded
imports in value, the difference being permanently established
since 1879. It would appear that the home market is saturated,
while the foreign market has extended.
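The comparison made in this paragraph is simply the difference, year by year, between the value of piece goods exported and the value of raw cotton imported. The sketch below computes it for a few years chosen from the table (values in thousands of pounds) ; the choice of years is for illustration only.

```python
# year -> (value of piece goods exported, value of raw cotton imported),
# both in thousands of pounds, taken from the table of the cotton trade.
trade = {
    1860: (42142, 35757),
    1863: (37634, 56282),
    1867: (53128, 52003),
    1880: (57678, 42772),
    1898: (47910, 34126),
}


def excess_of_exports(exports, imports):
    """Excess (+) or defect (-) of export value over import value."""
    return exports - imports


balance = {year: excess_of_exports(*values) for year, values in trade.items()}
for year, amount in balance.items():
    state = "exports exceed" if amount > 0 else "imports exceed"
    print(year, amount, state)
```

The negative figure for 1863 corresponds to the cotton famine, and the near equality of 1867 to the statement in the text.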

The line representing the value exported may be described
in a few words : a general and rapid increase took place from
1850 to 1866, interrupted only by the Civil War ; since 1866 the
fluctuations have been violent, but the general average stationary.
The effect of the Civil War is well emphasised by all the lines
here, and is clear also in the diagram facing p. 164. With a
little experience in the use of diagrams these lines may be
smoothed by the eye alone.

Choice of scale for quantities.

The unit of the quantity of imports is 1,000,000 cwt. of raw
cotton ; one-tenth of this can be distinguished in the figure. In
1854, 7,923,000 cwt. were imported, and their value was
£20,175,000. If we represent 2 cwt. by the same vertical length
as £5, as done in the figure, the lines begin at practically the
same point. Adopting this scale, we are able to see at once the
divergence of quantity from value during the period.

* See Wages in the Nineteenth Century, by the present author, diagram
p. 90.

In the year 1891, 17,811,000 cwt., valued at £46,081,000, were
imported. The sum would have bought 18,096,000 cwt. at the
price of 1854, a difference of only 1½ per cent. ; so that it
happens in this case that the value and quantity lines are nearly
together again in 1891. The actual course of prices is shown
by the lowest line on the diagram. In 1862 quantity falls more 
quickly than value as price rises, and as the supply recovered 
in 1866 value went up before and more violently than quantity, 
owing to the high price. In 1869 quantity rose while value fell, 
but otherwise the lines fluctuate together and continually tend 
towards each other. 
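The arithmetic behind the comparison of 1891 with 1854 can be retraced from the table. The sketch below uses the table's figures for the two years (quantities in thousands of cwt., values in thousands of pounds) and finds what the sum spent in 1891 would have bought at the average price of 1854.

```python
# Figures from the table of the cotton trade (000's omitted).
quantity_1854, value_1854 = 7923, 20175
quantity_1891, value_1891 = 17811, 46081

# Average price in 1854, in pounds per cwt.
price_1854 = value_1854 / quantity_1854

# Quantity (thousands of cwt.) that the 1891 outlay would buy at that price.
would_buy = value_1891 / price_1854

# Percentage by which the quantity actually imported falls short of this.
shortfall_per_cent = 100 * (would_buy - quantity_1891) / would_buy

print(round(price_1854, 2), int(would_buy), round(shortfall_per_cent, 1))
```

The result agrees with the 18,096,000 cwt. and the difference of about 1½ per cent. stated above.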

History of exports.

The study of quantity and value in exports is more
interesting. It is not obvious what commodity is the best
representative of cotton exports. In 1895, 5,000,000,000 yards
of piece goods valued at £46,700,000, 812,000 pairs of stockings
valued at £220,000, 23,800,000 lbs. of sewing thread valued at
£3,000,000, and 250,000,000 lbs. of cotton yarn valued at
£9,200,000, were exported. A good plan, perhaps, would be to
take so many yards of piece goods as equivalent to so many
pounds of yarn, the relative prices being the criterion, and to
add these together to determine the quantity ; in the figure,
however, piece goods only are taken.

In this case there is no simple relation between quantity and 
value at the first date, and there is no simple method of making 
the two scales correspond. Having marked the value line on the 
squared paper in use, it was necessary to draw out a new system 
for the quantities. In 1854, 1,693,000,000 yards were valued at
£25,055,000. Then 1,693/25 yards corresponded to £1, i.e., 67.7
yards to a unit ; and each number of yards had to be reduced to
this scale. This is done in practice quite easily by a mechanical 
scale, by which numbers can be automatically reduced in any 
required ratio. The scale is then entered to the right-hand 
side of the figure. It is of course not easy to read the exact 
numbers off the figure, but it can be done with the help of 
a ruler. To avoid this difficulty, the actual amounts can be 
entered on the diagram at critical places. But after all it is 
not the object of the diagram to make it possible to read the 
numbers ; the object is to show the relative rises and falls, and 
the steepness, and to allow comparisons of the lines. The figures 
should be taken from a table. No scale is in reality necessary, 
except for the process of drawing the lines. 
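The reduction of yards to the value scale, which the text performs by a mechanical scale, amounts to dividing every quantity by a fixed number of yards per pound. In the sketch below the figure of 67.7 is obtained by taking the value of 1854 in round millions (an inference from the quotient given in the text) ; the exact figures of the table give a slightly smaller ratio. The function name is hypothetical.

```python
# 1854 figures: 1,693 million yards, valued at £25,055,000.
YARDS_1854_MILLIONS = 1693
VALUE_1854_POUNDS = 25_055_000

# Ratio with the value rounded to £25 millions, which gives the 67.7 of the text.
yards_per_pound_rounded = YARDS_1854_MILLIONS / 25
# Ratio with the exact value from the table.
yards_per_pound_exact = YARDS_1854_MILLIONS * 1_000_000 / VALUE_1854_POUNDS


def height_on_value_scale(yards_millions, yards_per_pound=yards_per_pound_exact):
    """Height, in thousands of pounds on the value scale, at which a
    quantity given in millions of yards is to be plotted."""
    return yards_millions * 1_000_000 / yards_per_pound / 1000


# By construction the quantity line starts where the value line starts.
start = height_on_value_scale(1693)
# The quantity exported in 1898 (5,216 million yards) on the same scale.
end = height_on_value_scale(5216)
print(round(yards_per_pound_rounded, 1), round(start), round(end))
```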

Quantity and value.

The history of quantity of exports of cotton (and of other
textiles) is quite different from that of values. Value fluctuates
and shows very little rise since 1866. Quantity fluctuates, but
not greatly, except at the Civil War, and except in the '60's,
and in '80-6 shows a general rise. The smoothed curve would
rise throughout.

One important cause of this difference is, that, as Sir R. Giffen 
has pointed out, a large sum should be in reality deducted from ex- 
port values to allow for the import value of the raw cotton, before 
any conclusions are drawn as to the progress of British manu- 
facture. Now, as we have seen, the price of raw cotton has 
fallen very fast during precisely the period (since 1865) that 
export value has not grown. A greater corresponding deduction 
should be made in the earlier years than in the later, which would 
result in a definite, if fluctuating, rise in the period. This would 
not make values increase so fast as quantities ; the difference is 
due to the general causes of the fall of price of manufactured goods. 
By looking carefully at the diagram it will be seen that the quantity 
line approached that of value, when the price was falling in 1866-68, 
and fell away again with the higher price of 1872 ; after 1872 
the quantity line gets nearer again, and crosses the value line in 
1875, when the price was the same as in 1850; since 1875, as 
prices fell, the divergence has steadily increased. 

It must be admitted that a study of the diagram repre- 
senting these figures leads much more rapidly and safely to 
many interesting conclusions than the table on p. 164 of the 
figures themselves. 

3. Comparisons of Series of Figures. 

A. Before proceeding to the study of the next diagram, it will 
be well to define more exactly what is our object in comparative 
studies of figures, and to consider the means at our disposal. 

Quæsita in comparisons.

When dealing with two series of similar quantities such as
the course of trade or population in two countries, we wish to
see the general rate of progress (to be done by smoothing the
curve), the years of special increase,
things that the eye can see — the increase, the rate of increase, 
and the dates of change of rate of increase. The most obvious 
way to do this is, to take the same scale and base line for both 
countries and the same unit of measurement ; but this method 
does not take us all the way. We can judge differences, it is 
true, and the additions in all the years in both countries, and we 
can see the highest and lowest points and dates of change of 
rate of increase ; but we cannot compare rates of increase. 
It is not easy to judge ratio, though a rough guess at it is 
possible. Thus if the trade is very different in magnitude in 
the two countries, equal absolute increments will mean very 
different relative increments, and it is difficult to be always on 
one's guard. 

Percentage scales.

The remedy for this is to alter the arrangement of scales.
Make a second figure, in which the unit shall be not a sum of
money, but a percentage : let 1 per cent. of England's trade, say
in 1850, be the unit for the English line ; and 1 per cent. of the
trade of Germany, at the same date, for the German line. In
other words, express the trade of
both countries as percentages of their value in a given year, 
and draw lines to represent these percentages. Alongside the 
diagram two or more scales can be placed showing the absolute 
amounts of the trade of each country. Then the rates of 
increase will be comparable, equal increments representing equal 
percentages of the trade of each country ; and, in addition, the 
dates at which either country gained ground relatively to the 
other can be easily picked out. The question as to whether
absolute rates or relative rates should be studied is a very
common one in statistics. Sometimes the absolute magnitude
should be known, as for instance when we want to estimate the
effect of measures which will affect the well-being of special
classes, or the trade of special countries ; sometimes the
relative rate, as when we want
to watch the progressive increase of different industries, or to be 
on our guard as to future competitors. The two studies gene- 
rally require two different diagrams though they may represent 
the same numbers. 



It will be seen that the chief difficulty lies in the choice of 
the year in which the quantities are to be equated ; this must 
be decided by the nature of the argument which the diagram 
is to illustrate. 



We may compare the following figures

Year      1880     1890     1900
A          220      440      330
B          160      240      400

in three ways, thus :

[Three diagrams, each with a percentage scale and the
corresponding scales of amounts for A and B : 1. Expressed as
percentages of values. 2. Expressed as percentages of values.
3. Expressed as percentages of values in 1900.]

In figure 3 the fluctuations are seen as percentages of the
values at the last date, and are thrown into better proportion
than in figure 1. It is frequently the case that the equating of
quantities at the most recent date throws what are often small
beginnings into their right proportion when viewed from the
modern standpoint. The statement that the values in 1880
were 40 and 67 per cent. respectively of the corresponding
present values is in better perspective than the statement that
the values in 1900 were 250 per cent. and 150 per cent. of the
corresponding values in 1880; but circumstances must decide
in each case which method is to be adopted.
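The choice of base year can be tried numerically. The following is a minimal sketch, not part of the original text: the function name `as_percentages` is hypothetical, and B's value for 1880 is taken as 160, an assumption that agrees with the 250 per cent. quoted for 1900.

```python
# Values of the two series at the three dates (B in 1880 assumed to be 160).
series = {
    "A": {1880: 220, 1890: 440, 1900: 330},
    "B": {1880: 160, 1890: 240, 1900: 400},
}


def as_percentages(values, base_year):
    """Express every value as a percentage of the value in base_year."""
    base = values[base_year]
    return {year: 100.0 * v / base for year, v in values.items()}


# Equate the quantities at the first date, and again at the last date.
pct_of_1880 = {name: as_percentages(v, 1880) for name, v in series.items()}
pct_of_1900 = {name: as_percentages(v, 1900) for name, v in series.items()}

for name in series:
    print(name, pct_of_1880[name], pct_of_1900[name])
```

The same numbers give two different pictures: on the 1880 base the later values stand well above 100, while on the 1900 base the early values appear as fractions of the present ones.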

These points are fully illustrated by the annexed diagrams,
the object of which is to analyse the progress of our trade with
our colonies and with foreign countries, especially with
Germany. The first figure shows the total imports and exports,
and the parts of each which are colonial and foreign, the scale
in millions of pounds being the same for all the lines. A line is
also given for imports from Germany, Holland, and Belgium; these
are grouped together, because it is not possible to distinguish
in the returns from the two latter home manufactures from German
goods in transit. It is not clear from this diagram which part of
our imports has increased most rapidly. The three lines are,
therefore, redrawn in the second diagram, on a percentage scale,
all the values being expressed as percentages of the
corresponding values in 1898. It is now seen that imports from
foreign countries and from our colonial possessions and India
have marched together except during the period of the cotton
famine, but the trade from Germany has increased more rapidly
than either. If we had equated the quantities in 1862, the German
line would have far outpassed the others by 1898; but the
impression given would be erroneous as regards absolute
quantities, for the increase was only £50,600,000 for the one,
while it was £110,500,000 for the other. The remaining diagram
shows the relative rates of increase for Germany, Holland and
Belgium, and the British possessions respectively, since 1870.

[Table: Imports and Exports, 1862-1898. Unit in all columns,
£100,000. Columns: total imports; total exports; exports to
British possessions; exports to foreign countries; imports from
British possessions; imports from foreign countries; imports from
Germany, Holland, and Belgium.]

[Side-note: Causal relations.]
B. Series of figures are often compared graphically with a
view to discovering or illustrating causal relations. In such
cases we do not only study relative growth as in the last
diagram discussed, but look throughout the period for any signs
of resemblance in rates of growth, dates of maxima and minima,
or synchronism in any changes. The methods by which such
comparisons are made are difficult, and need careful analysis.
For instance, we may wish to show that an increase of the
allowance for outdoor relief is connected with an increase of
pauperism. In this case one line will represent money, the other
the number of persons, and there is no common unit; we need not
calculate percentages, but having chosen any scale for money, we
can make equality in any year by a simple adaptation of the
scale for number. We shall wish to establish first, that an
increase or decrease of money occurred at, or just before, an
increase or decrease in number; and secondly, that the greater
the increase of one the greater the increase of the other. In
order to show direct connection, we shall try to make one line
lie as nearly as possible over the other.

[Side-note: Construction.]
Draw a preliminary diagram in which both lines are entered
on any scales; this will suggest the resemblances to be tested.
Notice in what period the fluctuations are greatest: this in
general should be the period to be taken, for it is here that
the causal relations have had most play. If any other period is
chosen for any special reasons, these should be made clear, for
otherwise a critic may legitimately object that it is only in
this period that the connection is distinct. There would be
little difficulty in finding short periods in any two curves
where the fluctuations synchronised. Take the averages of both
money and of number over the period chosen, and draw a second
diagram in which the scale for number is chosen by making this
average for number equal to the corresponding average for money.
Any correspondence between the two lines can be at once detected.
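The rescaling step of this construction can be sketched as follows. This is an illustration only: the function name and the figures for relief money and pauper numbers are hypothetical and are not taken from any return.

```python
from statistics import mean


def rescale_to_common_average(money, number):
    """Rescale `number` so that its average equals the average of `money`.

    Returns the rescaled series and the scale factor that was applied.
    """
    factor = mean(money) / mean(number)
    return [factor * n for n in number], factor


# Hypothetical figures over the chosen period, for illustration only.
relief_money = [40.0, 44.0, 50.0, 46.0]        # allowance, any money unit
pauper_count = [800.0, 860.0, 1000.0, 940.0]   # number of persons

scaled_count, factor = rescale_to_common_average(relief_money, pauper_count)

# After rescaling, both lines share one average and can be laid
# one over the other on the same vertical scale.
print(factor, scaled_count)
```

Only the scale for number is altered; the money scale stays as first chosen, which matches the instruction to adapt one scale to the other.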

The process just described is completely carried out in the
first two diagrams comparing the marriage rate and foreign trade
facing p. 175.

[Side-note: Inverse relations.]
There are many cases when the changes in the magnitudes
which we regard as the causes are inversely proportional (in
the opposite sense) to those in the magnitudes which we regard
as the effects. For instance, if we are comparing trade
improvement with the number of unemployed, and make the
construction just described, the maxima of the first line would
synchronise with the minima of the second. Greater clearness can
be obtained by inverting one of the diagrams, plotting out the
number employed instead of that unemployed, and then the changes
should be in the same sense in both lines.
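The inversion amounts to replacing each percentage unemployed by its complement. A minimal sketch with hypothetical values shows that the minimum of the one series falls at the same date as the maximum of the other.

```python
# Hypothetical percentages unemployed in five successive years.
unemployed_pct = [5.1, 3.3, 1.5, 3.3, 4.0]

# Inverting the diagram: plot the percentage employed instead.
employed_pct = [100.0 - u for u in unemployed_pct]

# The date of least unemployment is the date of greatest employment,
# so the inverted line now moves in the same sense as a trade line.
idx_min_unemployed = unemployed_pct.index(min(unemployed_pct))
idx_max_employed = employed_pct.index(max(employed_pct))
print(idx_min_unemployed, idx_max_employed)
```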

[Side-note: More complex relations.]
In the above construction the lines will only lie one over the
other throughout their fluctuations, if the changes in one
quantity are in strict proportion to the changes in the other,
if an increase of 10 per cent., for instance, in the allowance
for outdoor relief corresponded to one of 10 per cent. in the
number of paupers. It is very rare that such a simple relation
is found; all we can see in general is that the maxima and
minima occur at the same dates, that the fluctuations agree
throughout in sense in both series, and that the greater
fluctuations in the one correspond to the greater fluctuations
in the other.

[Side-note: Use of diagrams.]
Diagrams may often be used to suggest correlation between
two series of figures, and this indeed is one of their chief
merits, and they may be used to illustrate arguments on any
subject, but at this point their utility ends, for they cannot
be made to prove much. Causal relations are very difficult to
establish, and the original figures must be critically consulted
when theories are to be brought to the test.

[Side-note: More exact method.]
We have not yet exhausted the power of diagrams for making
such comparisons, but the following method must be applied only
with great caution. Suppose that we wish to establish that an
increase of 1 bushel in the quantity of wheat to be bought for a
sovereign corresponds to an increase of 1.5 in the marriage rate
per 1,000, or any such strict numerical proportion. Draw a
diagram representing the quantities of wheat, take the average
for the period chosen for comparison, and write the scale so as
to read 1, 2, 3 . . . bushels above or below the average. Draw no
base line. Now enter a line to represent the excess or defect of
the marriage rate from its average in the chosen period, on a
scale such that 1.5 in excess is represented by the same
vertical distance as 1 bushel. The closeness of the two lines
indicates the validity of the theory. The danger of this method
is, that with no base line there is no possibility of judging
the amounts of the changes relative to the totals. The insertion
of the necessary two base lines would confuse rather than aid.
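A sketch of this method in code may help. It assumes the hypothesised ratio of 1.5 in the marriage rate to 1 bushel; the function names and both series are hypothetical, the second being constructed so that it fits the ratio exactly.

```python
from statistics import mean


def deviations(values):
    """Excess or defect of each value from the average of the series."""
    avg = mean(values)
    return [v - avg for v in values]


def compare_on_ratio(x, y, y_per_x):
    """Put the deviations of y on the scale of x using the assumed ratio.

    Returns both sets of deviations and the mean vertical gap between them;
    a small gap means the two lines lie close together.
    """
    dx = deviations(x)
    dy_on_x_scale = [d / y_per_x for d in deviations(y)]
    gaps = [abs(p - q) for p, q in zip(dx, dy_on_x_scale)]
    return dx, dy_on_x_scale, mean(gaps)


# Hypothetical series, for illustration only.
wheat_bushels = [4.0, 5.0, 6.0, 5.0]
marriage_rate = [14.5, 16.0, 17.5, 16.0]

dx, dy_scaled, mean_gap = compare_on_ratio(wheat_bushels, marriage_rate, 1.5)
print(dx, dy_scaled, mean_gap)
```

As the text warns, only deviations are compared here, so nothing in the output shows how large the changes are relative to the totals.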



Marriage Rate, Total Exports and Imports per Head of Population,
and Average Price of Wheat per Quarter.

Year    Marriage    Total Exports and     Average Price of
        Rate.       Imports per Head.     Wheat per Quarter.
                    £ s. d.               s. d.
1860    17.1        13 8                  53 3
1861    16.3        13 3                  55 4
1862    16.1        13 8                  55 5
1863    16.8        15 2 7                44 9
1864    17.2        16 8 7                40 2
1865    17.5        16 7 5                41 10
1866    17.5        17 14 5               49 11
1867    16.5        16 9 6                64 5
1868    16.1        17 6                  63 9
1869    15.9        17 3 9                48 2
1870    16.1        17 10 3               46 10
1871    16.7        19 9 6                56 8
1872    17.4        21                    57
1873    17.6        21 4 2                58 8
1874    17.0        20 11                 55 8
1875    16.7        19 19 4               45 2
1876    16.5        19 10                 46 2
1877    15.7        19 5 5                56 9
1878    15.2        18 2 1                46 5
1879    14.4        17 16 10              43 10
1880    14.9        20 3 3                44 4
1881    15.1        19 17 5               45 4
1882    15.5        20 8 10               45 1
1883    15.5        20 13 2               41 7
1884    15.1        19 4 1                35 8
1885    14.5        17 16 9               32 10
1886    14.2        17 0 10               31
1887    14.4        18 11 7               32 6
1888    14.4        18 12 1               31 10
1889    15.0        19 19 9               29 9
1890    15.5        19 19 7               31 11
1891    15.6        19 14                 37
1892    15.4        18 15 6               30 3
1893    14.7        17 14 9               26 4
1894    15.1        17 11 9               22 10
1895    15.0        17 19 3               23 1
1896    15.8        18 14 1               26 2

[Entries with fewer than three numbers under £ s. d., or fewer
than two under s. d., have lost a zero in the source.]

It is clear from the preceding analysis that, by the choice
of scales and base lines, the points at any two dates may be
made to coincide on any number of accurately drawn lines
representing series of figures.

The preceding paragraphs are completely illustrated by the 
adjoining diagram. 

[Side-note: Illustration of method.]
On the left are given lines representing the price of wheat in
shillings per quarter, the total of values of exports and imports
divided by the population, and the marriage rate per 1,000. The
scales chosen are simply those which are easiest to use, and
throw the lines into proper relief. The points in each scale for
the same years are over one another, but the base lines and
scales differ.

[Side-note: Marriage rate and trade.]
We can see at a glance whether there is resemblance between
the courses of these figures. There is at any rate a general
correspondence between the fluctuations of trade and of the
marriage rate since 1870, and possibly earlier. There are points
of likeness between wheat prices and trade; in 1870-73 both rise
together, and fall in 1873-75; both rise in 1876-77, fall in the
following two years, and then rise again; both fall from 1881 to
1886 and then rise. There are also many cases in which the
motions do not agree, especially 1862-64, and 1887-89.

[Side-note: Marriage rate and wheat.]
If we look now at the price of wheat and the marriage rate,
which in the earlier part of the century used to be closely
related, the one rising when the other fell, we see that there
is no great resemblance either in this or the contrary sense. In
1860-62 and in 1862-64 wheat rose and fell, while the marriage
rate fell and rose; wheat rose in 1865-67, while the marriage
rate was first stationary and then fell a little; then it
continued to fall in 1868-70, though wheat was falling also; in
1870-80 the marriage rate shows one long, wheat two short,
fluctuations. Since 1880, in years in which wheat fell, the
marriage rate in general fell also and vice versa.

[Side-note: Connecting links.]
Let us consider for a moment the possible links of connection
between these phenomena. When wheat was the chief object of
expenditure of the working class, its price was the chief thing
for them to consider; and so when wheat rose the marriage rate
fell. On the other hand, now that wheat is cheap and wages
higher, a change in the price of the loaf is only of great
importance to a minority; it is now the general prosperity of
the country, well indicated by the condition of foreign trade,
that raises the marriage rate.

When exports and imports are increasing in value, trade is 
stimulated, and in spite of rising prices, marriageable people are 
sanguine that the prosperity will remain and the prices fall ; but 
when the prices fall, so do the profits and incomes, and marriage- 
able people are more prudent. For these reasons we may expect 
the marriage rate and foreign trade lines to resemble each other. 

Now the increase of the marriage rate corresponding to an 
inflation of trade, and an inflation of trade to a time of rising 
prices in general, we shall find the price of wheat in particular, 
which is connected with the course of prices in general, rising 
when trade is inflated and falling when it is depressed, and 
therefore rising and falling with the marriage rate. But since 
the price of wheat is influenced also by special causes, it will not 
always correspond to the state of trade, and still less to the 
marriage rate, with its former tendency to opposite variations. 

There is no need then for surprise that the curves of marriage
rate and trade correspond; that wheat and trade correspond,
but less closely ; and that wheat and marriage show a double 
tendency. The correspondence between marriage and trade is 
investigated on the diagram. That between wheat and trade 
should be done on an identical method. Marriage and wheat 
should be compared twice on different plans: first for direct 
correspondence, and then by redrawing the wheat curve with its 
base line at the top for inverse correspondence. 

[Side-note: Construction of diagram.]
To effect the comparison between the course of trade and
the marriage rate, the following steps are taken. On examining
the two curves on the first figure, it is seen that the
resemblance does not begin before 1869; the parts of the curves
since 1869 should therefore be brought into close
correspondence. The average marriage rate, 1869-94, is 15.5, and
average imports and exports per head, £19. The marriage curve is
drawn in the ordinary way; then with the help of a sliding scale
the trade curve is put in, so that with the same base line £19
falls on the 15.5 line.

The result is that the curves are seen to rise and fall at the
same dates, but not to the same extent; for, while the lines
keep nearly parallel from 1873 to 1879, the falls from the
maximum being equal, after 1879 the trade line fluctuates further
above and below its average than the marriage rate does.

[Side-note: Final comparison.]
It remains to test graphically whether the fluctuations are
proportional to one another. The average fluctuations in the
two lines must now be equated.

                    Marriage Rate.

Year        Maxima.    Minima.    Differences.
1869          ...       15.9
                                      1.7
1873         17.6        ...
                                      3.2
1879          ...       14.4
                                      1.1
1882-3       15.5        ...
                                      1.3
1886          ...       14.2
                                      1.4
1891         15.6        ...
                                      0.9
1893          ...       14.7

            Average of differences    1.6

             Imports and Exports per Head.

Year        Maxima.     Minima.     Differences.
            £  s. d.    £  s. d.    £  s. d.
1867          ...       16  9  6
                                    4 14  8
1873        21  4  2      ...
                                    3  7  4
1879          ...       17 16 10
                                    2 16  4
1883        20 13  2      ...
                                    3 12  4
1886          ...       17  0 10
                                    2 18 11
1889        19 19  9      ...
                                    2  8  0
1894          ...       17 11  9

            Average of differences  3  6  3

Hence £3 6s. 3d. must be represented on the same scale as 1.6.
This is making the hypothesis that a change of £1 in the total
trade per head synchronises with a change of .5 in the marriage
rate per thousand. The scales so chosen are marked above and
below the common average line in the right-hand figure.
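The averaging of the differences between successive maxima and minima can be checked with a short calculation. This is a sketch under stated assumptions: the helper names `to_pounds` and `average_swing` are hypothetical, and the turning-point values are those read from the marriage-rate and trade figures of the text.

```python
def to_pounds(pounds, shillings, pence):
    """Convert pounds, shillings and pence to decimal pounds."""
    return pounds + shillings / 20 + pence / 240


def average_swing(turning_points):
    """Mean absolute difference between successive maxima and minima."""
    diffs = [abs(b - a) for a, b in zip(turning_points, turning_points[1:])]
    return sum(diffs) / len(diffs)


# Marriage rate at its turning points, 1869 to 1893.
marriage_turns = [15.9, 17.6, 14.4, 15.5, 14.2, 15.6, 14.7]

# Imports and exports per head at their turning points, 1867 to 1894.
trade_turns = [to_pounds(*t) for t in [
    (16, 9, 6), (21, 4, 2), (17, 16, 10), (20, 13, 2),
    (17, 0, 10), (19, 19, 9), (17, 11, 9),
]]

marriage_swing = average_swing(marriage_turns)
trade_swing = average_swing(trade_turns)

# Change in the marriage rate that is set equal to a change of one pound.
rate_per_pound = marriage_swing / trade_swing
print(marriage_swing, trade_swing, rate_per_pound)
```

The ratio of the two average swings is what fixes the relative scales of the right-hand figure; it comes out close to one half per pound.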

It is now seen that the fluctuations since 1880 lie more 
closely together in the two curves, but that this closeness has 
been obtained by the partial sacrifice of the years 1872-80, and 
there is now a complete disagreement before 1870. A yet 
shorter period, 1879-1893, would show a very close agreement;
but so special a selection would vitiate any general argument. 

Our conclusion is, that since 1870 the causes which affect 
foreign trade have also affected the marriage rate at the same 
dates and in the same sense, and that the more marked the 
effects on the one, the more marked are the effects on the other 
also, but that there is no law of simple proportion between them. 

Note. The relations tested by the middle diagram may be
represented by the equation $\frac{x}{a} = \frac{y}{b}$, and that of the
right-hand diagram by $\frac{x-a}{y-b} = c$ (a constant), where $x$ and $y$
stand for the value of trade and the marriage rate, and $a$ and $b$
for their average values, and $c$ is chosen so as to make the
average fluctuations of the two sets of quantities equal. By the
method of least squares $c$ could be chosen so that the
correspondence should be closer than with the value given by the
calculation in the text.
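One way to make the least-squares choice is sketched below. It assumes the relation is written as trade deviation = c times marriage-rate deviation and minimises the squared differences in the trade direction; this particular form is an assumption, since the note does not say which way the squares are to be taken. The function names are hypothetical; the five years used (1873, 1879, 1883, 1886, 1889) and the averages of £19 and 15.5 are read from the text.

```python
def least_squares_c(x, y, a, b):
    """Value of c minimising sum(((x_i - a) - c * (y_i - b)) ** 2)."""
    dx = [xi - a for xi in x]
    dy = [yi - b for yi in y]
    return sum(p * q for p, q in zip(dx, dy)) / sum(q * q for q in dy)


def squared_error(x, y, a, b, c):
    """Sum of squared gaps between trade deviations and c times rate deviations."""
    return sum(((xi - a) - c * (yi - b)) ** 2 for xi, yi in zip(x, y))


# Trade per head in decimal pounds and marriage rate for five years.
trade = [
    21 + 4 / 20 + 2 / 240,     # 1873
    17 + 16 / 20 + 10 / 240,   # 1879
    20 + 13 / 20 + 2 / 240,    # 1883
    17 + 0 / 20 + 10 / 240,    # 1886
    19 + 19 / 20 + 9 / 240,    # 1889
]
marriage = [17.6, 14.4, 15.5, 14.2, 15.0]
a, b = 19.0, 15.5   # period averages quoted in the text

c = least_squares_c(trade, marriage, a, b)
print(c, squared_error(trade, marriage, a, b, c))
```

Any other value of c gives a larger sum of squares on these five points, which is the sense in which the correspondence is closer.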

4. Periodic Figures.

[Side-note: Periodic figures.]
We now come to the consideration of periodic figures; that
is, of figures which within a given period, in a year for
instance when returns are monthly, reach maxima and minima at
assigned times, and show fluctuations recurring with regularity
in successive periods. In physical phenomena, such as the
sunrise, the same daily numbers will represent the phenomena,
almost without change, year after year. In the case of the tides
we find a link between the more rigid annual curves of seasonal
phenomena, and the less marked periods of social statistics; for
the tides are subject to separate influences with periods of 24
hours, 24 hours 50 min., 29 days, 1 year, and others, and the
effects of these influences are often masked one by the other.
In the weekly figures of the Bank of England, Jevons discovered
monthly, quarterly, and annual periods.*

In social and industrial statistics we usually find an annual 
period, combined with a general slow movement upwards or 
downwards, and confused by an irregular period of about ten 
years, due to alternate inflation and depression of trade. The 
influences of these three movements on the resulting numbers 
can be investigated, and the general methods of examining 
periodic figures fully explained by the complete discussion of one 
example, viz., the monthly returns of want of employment of the 
Friendly Society of Ironfounders. For another example the 
reader is referred to Jevons' essay, On the Frequent Autumnal
Pressure in the Money Market;* and for an exercise, to the
monthly gazette wheat prices, where the gradual change of the 
shape of the annual diagram can be traced in relation with 
the increasing influence of harvests in all the quarters of the 
globe. 

[Side-note: General features of the figures.]
These figures are specially suitable for showing graphically
a double period, and the influences of rapid annual fluctuations
and general movements of longer period on each other. Looking at
the table on p. 179 along the lines for the several years, it
will be seen that there is always a fall in the middle of the
year. Looking down a vertical column under any

* See Investigations in Currency and Finance.

Number of Unemployed Ironfounders, expressed as percentages
of estimated total number of members, month by month: calculated
from figures given in the Annual Report of the Friendly Society of
Ironfounders, 1894.

Year              Jan.  Feb.  Mar. April   May  June  July  Aug. Sept.  Oct.  Nov.  Dec.  Average
                                                                                          for Year
1855              11.1  14.1  14.0  12.5  10.0   9.9   8.7   8.7   6.8   7.7   8.8  12.0     10.4
1856              10.9  12.6  12.2  10.0   9.4   7.5   6.9   7.3   6.9   8.1   8.7   9.9      9.2
1857              10.1   9.5   8.7   8.7   8.1   7.3   6.8   6.9   6.2   8.0  14.0  17.7      9.3
1858              20.2  20.6  20.9  19.8  20.3  17.8  15.9  14.3  13.1  11.9  11.5  11.2     16.5
1859              10.6   8.8   6.5   5.2   4.0   4.4   3.2   3.6   3.4   3.8   4.6   5.1      5.3
1860               4.0   3.2   2.6   2.2   1.6   1.7   2.3   2.6   2.6   2.9   3.7   5.6      2.9
1861               6.0   6.9   6.5   7.9   7.8   8.4   6.9   7.9   9.5  10.7  12.4  13.8      8.7
1862              14.5  14.0  14.0  14.6  14.4  13.7     ?  12.9  12.2  13.5  14.9  16.0     14.0
1863              15.5  13.9  13.6  11.6  10.4   9.3   7.8   7.4   6.6   5.3   5.0     ?      9.5
1864               6.0   7.1   6.6   5.3   4.4   3.3   2.8   2.8   2.6   3.3   4.2   8.1        ?
1865               5.4   5.3   5.3   4.6   3.4   2.9   2.6     ?   2.7   2.6   2.3     ?        ?
1866               4.2   5.4   5.1   3.6   5.1   6.5   5.9   6.5   6.9   7.4     ?  13.8        ?
1867              12.4  13.2  15.4  16.7  14.9  14.6  14.2  13.9  15.7  16.3  18.9  22.6     15.7
1868              22.1  20.9  19.8  18.6  16.7  15.8  14.9  14.7  14.2  14.1  15.6  17.4        ?
1869              17.3  17.1  16.8  15.6  15.2  13.6  13.3  11.8  13.1  13.6  14.8  15.3        ?
1870              14.5  10.9   8.7   7.2   5.0   4.5   3.7   4.5   4.9   5.0   5.6   8.3        ?
1871               7.2   5.6   3.6   2.8   1.6   1.5   1.6   1.2   0.9   1.4   1.1   2.2      2.6
1872               1.1   1.1   0.9   0.8   1.2   0.7   0.9   1.0   1.3   1.8   2.6   4.1      1.5
1873               3.3   2.8   2.7   2.5   2.1   2.0   3.0   4.9   4.3   3.3   3.3   5.1      3.3

Average 1855-73   10.3  10.2   9.7   8.9   8.2   7.7   7.1   7.2   7.1   7.5   8.5  10.4      8.6

1874               4.9   3.9   3.9   3.5   4.9   3.9   3.8   3.4   3.5   3.7   3.9   5.0      4.0
1875               4.6   3.4   3.5   2.8   2.8   2.8   3.3     ?   3.6   4.1   4.1   5.0      3.6
1876               4.9   4.9   4.9   5.4   4.8   5.2   5.7   5.5   6.4   6.4   6.2  10.3      5.9
1877               7.7   7.4   7.0   6.9   8.4   7.6   7.4   7.8   9.6  10.9  12.3  16.3      9.1
1878              14.0  14.3  13.5  15.3  13.3  14.6  13.6  13.2  13.3  14.0  15.7  21.0     14.7
1879              23.2  23.8  24.7  25.5  22.3  23.4  21.5  22.6  22.5  21.1  18.0  16.6     22.1
1880              15.2  12.9  11.1  10.0  10.0   9.7   9.8  10.0  10.0   9.2   9.2  10.2     10.6
1881              11.5  10.8  10.1  10.1   7.6   7.5   6.5   5.8   5.6   5.4   5.0   6.6      7.7
1882               5.5   5.5   5.3   4.5   3.6   3.8   3.2   3.4   3.6   4.1   4.4   6.0      4.4
1883               3.6   4.8   5.2   4.3   4.2   3.6   3.9   4.3   4.3   4.2   4.0   6.6      4.4
1884               6.1   6.2   5.9   6.5   6.5   6.9   6.5   7.6   8.1   7.8   9.8  10.9      7.4
1885              10.2  11.1  10.0  10.1   9.8   9.1   9.8  10.7  11.8  11.6  12.7  13.6     10.9
1886              14.1  15.0  15.2  15.5  13.4  13.1  12.1  12.7  13.6  13.9  12.7  12.9     13.7
1887              12.4  11.6  10.2   9.1   9.2  10.6   9.2   8.8   9.6   9.4   9.4   9.1      9.9
1888               7.8   7.5   6.4   6.4   5.9   5.2   5.7   5.0   5.1   4.8   3.2   3.5      5.5
1889               3.1   3.3   2.4   2.1   1.7   1.6   1.7   1.7   1.6   1.5   1.2   1.4      1.9
1890               1.3   1.3   3.2   3.1   2.8   2.4   2.4   2.7   2.7   2.7   2.7   2.7      2.5
1891               3.9   3.5   4.2   4.2   4.6   4.0   4.5   4.8   5.4   5.6   5.7   6.3        ?
1892               7.0   7.2   7.9   8.1   7.9   7.9   7.7   7.6   9.3  11.4  10.9  12.0        ?
1893              11.5  11.2  10.1   7.7   9.6   8.3   8.3   9.2  11.7  11.9  11.5  11.5     10.2

Average 1874-93    8.6   8.5   8.2   8.1   7.7   7.6   7.3   7.5   8.1   8.2   8.1   9.4      8.1

Average 1855-93    9.4   9.3   8.9   8.5   7.9   7.6   7.2   7.4   7.6   7.9   8.3   9.9      8.3

[Entries shown as ? are illegible in the source. In the 1863 row
only eleven monthly figures survive; they are entered in order
from January, and the position of the gap is uncertain.]

month, it will be seen that there is no generally marked
tendency towards increase or diminution, for high and low numbers
occur in the first as well as the last few years. The most
noticeable feature of these figures is the alternation of groups
of years of high and of low numbers. Percentages above 10 will be
found in 1861-63, 1866-70, 1877-81, 1884-87, and 1892-93. Let us
choose for examination the period 1866-70. The figure for January
1866 is below the Januaries of previous years; those of February,
March, and April are also low; from May to September the figures
are greater than those of 1865 or 1864; from October to December
they are greater than those of 1863, 1864, or 1865; in December
1867 they are greater than any previous year. Most of the figures
for 1868 beat the record up to that date; but from September 1868
the figure is lower than the one twelve months earlier till July
1872. This wave of unemployment then lasted from May 1866 to
September 1872.

[Side-note: Seasonal influence.]
Now let us watch the seasonal influence. In 1866 there
was no fall in the summer except in April, and there was a very
rapid rise in December. In 1867 a fall in May and a slight fall
from June to August was followed by a rapid rise in November and
December. There is a fall from December 1867 to September 1868,
but a rise follows in October, November, and December; since the
rise does not generally begin till August, it will be seen that
the general fall did not much delay the seasonal effect. In the
next year, 1869, there is a fall to a lower minimum in August,
but now the rise in December is very slight; next year the fall
is very quick to August, but the seasonal rise is not delayed.
From this it is clear that the seasons had their effect
throughout the fluctuation except in the opening year 1866, when
there was no fall, and that the rises in the autumn were very
much accentuated. Almost identical remarks would apply to the
period August 1875 to May 1881. In what month was the depression
of trade 1867-70 at its worst? The greatest figure given is 22.6
per cent. in December 1867, but unemployment in December is
generally greater than in any other month, and the figures for
any of the following six months may be more unusual; the
determination of the exact date will be best shown by diagrams.
It may be mentioned that most of these remarks were suggested by
Mr Hey, the former secretary of the Ironfounders' Society, who
drew up these figures.

[Side-note: The story from the diagram.]
If we now turn to the diagram, the following facts may be
noticed. The thick line showing the annual average percentages
shows a downward tendency till 1857, followed by an abrupt rise
and fall in 1858, then three years' rise to its original height,
returning to a minimum in 1865; the next wave covers six years,
and is marked by an extraordinarily sharp rise in 1867, and a
very low minimum in 1872. The exceptional condition of trade in
1872 could not last, but the rise is very gradual to 1876, when
the next cycle of trade is marked again by a six years' wave: the
rise is not so steep as in the former fluctuation, but lasts
longer, and a higher point is reached: the fall is at about the
same angle, and the minimum in 1882 is about the same as that in
1865. The next wave came before it appeared to be due, and lasted
seven instead of six years, but was much more moderate, and again
the rise was sharper than the fall. The minimum of 1889 did not
endure, and the figure ends with a suggestion that the maximum
will be in 1894, but only at a moderate height, and the next
minimum might be expected in 1898 or 1899, if causes similar to
those which influenced earlier trade depressions were still
acting. It may be found, in fact, from the Board of Trade
returns, that, taking all the trade unions who made returns
together, the maximum month was December 1892, and the maximum
year was 1893; after this the fall is regular to 1897, and a
trifling rise in 1898 is followed by a very low figure for 1899.*

In figure 5 the diagram is inverted and greatly compressed,
showing now the percentage employed. If the period 1876-82
is cut off by two vertical lines, readers may see how great were
the amounts of labour lost to the country and wages to the
workers in those years, and will agree with Professor Foxwell†
that irregularity of employment is one of the greatest evils
endured by the working classes.

In figure 5 the annual averages are smoothed by the method
explained above (p. 152), a seven-yearly average‡ being taken
to correspond to the general wave length. It will be seen that
there is no very marked tendency up or down in the thirty-nine
years, and that the smooth line is never far from the general
average of employment, 91.7.

* See Annual Abstract of Labour Statistics, 1895, p. T2i, for
various methods of treating these figures similar to those here
discussed.

† See Lectures on the Labour Question, 1886.

‡ For smoothing and studying periodic curves, see Professor
Poynting's paper in Statistical Journal, 1884.
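The inversion and the seven-yearly smoothing can be sketched together. The smoothing rule used here, a simple centred mean of seven consecutive annual figures, is an assumption, since the method of p. 152 is not reproduced in this section; the function name is hypothetical. The annual averages are those of the unemployment table for 1874 to 1884.

```python
def moving_average(values, window=7):
    """Centred moving average; one result for each full window of values."""
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


# Annual average percentage unemployed, 1874 to 1884.
years = list(range(1874, 1885))
unemployed = [4.0, 3.6, 5.9, 9.1, 14.7, 22.1, 10.6, 7.7, 4.4, 4.4, 7.4]

# Inverted series: percentage employed.
employed = [100.0 - u for u in unemployed]

# Seven-yearly averages, centred on 1877 to 1881.
smoothed = moving_average(employed, window=7)
centre_years = years[3:-3]
print(list(zip(centre_years, smoothed)))
```

The window of seven years is chosen, as in the text, to match the general wave length, so that one whole wave is averaged out at each point.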

The comparison of this diagram with that illustrating ex- 
ports (p. 151) is very instructive. Some of the results may 
be thus exhibited : — 





            Dates of                       Dates of
Minima          Maxima of          Maxima          Minima of
of Exports.     Unemployment.      of Exports.     Unemployment.
1862            1858 and 1862      1866            1865
1868            1868               1872            1872
1879            1879               1882            1882 or 1883
1886            1886               1890            1889
1894            1893









The figures may also be compared graphically by the methods 
of the previous or following sections. 

The averages for the nineteen Januaries, nineteen Februaries,
&c., in the years 1855-73, and similar averages for the years
1874-93, and the whole period are given in the table and
exhibited in figures 2, 3, 4. When we calculated the annual
averages just discussed we eliminated by that process the
seasonal fluctuations; by this new series of averages we
eliminate the influences of particular years. If we took, for
instance, all the November numbers out of a series of figures
totally uninfluenced by the seasons, if such could be found, and
compared these with the general average for all months, we
should in the long run find just as many instances above as
below this average; but if the figures were influenced by the
seasons, we should find a considerably greater number above than
below, or vice versa. The greater the seasonal influence, the
greater would be this excess or defect. Averaging numbers in
this way eliminates the non-seasonal causes, for by hypothesis
the excesses and defects due to them will in the long run
balance one another; and except by averaging these cannot be
eliminated, unless they can be actually calculated. The excess
of the November average above the general average will be
greater than that of October, if the seasonal causes exert more
influence towards excess in the former than in the latter month,
and the curve which shows these averages will show a resemblance
to that which would be obtained if the non-seasonal causes were
absent. It will be only a resemblance for two reasons: first,
because in the comparatively short series of years with which we
are generally obliged to be content, a very effective
non-seasonal cause will leave its mark on the average, as may be
seen in the table on p. 179; secondly, because seasonal and
non-seasonal causes are often not independent; a depression of
trade is accentuated by a sharp winter; a bad season in a year
of bad trade may increase the want of employment greatly and
suddenly, while a good summer in a prosperous year may reduce it
almost to zero. In the case we are considering the interaction
of causes tends to exaggerate the seasonal maximum and diminish
the minimum; in other cases a contrary effect might be found.
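The averaging by months can be sketched in a few lines. Only three years of the table (1855 to 1857) are used here to keep the example short, so the result is an illustration of the calculation and not the averages of the text; the function name is hypothetical.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Percentage unemployed, month by month, for three years of the table.
rows = {
    1855: [11.1, 14.1, 14.0, 12.5, 10.0, 9.9, 8.7, 8.7, 6.8, 7.7, 8.8, 12.0],
    1856: [10.9, 12.6, 12.2, 10.0, 9.4, 7.5, 6.9, 7.3, 6.9, 8.1, 8.7, 9.9],
    1857: [10.1, 9.5, 8.7, 8.7, 8.1, 7.3, 6.8, 6.9, 6.2, 8.0, 14.0, 17.7],
}


def seasonal_deviations(rows):
    """Average each month over all years, then subtract the general average."""
    by_month = zip(*rows.values())               # twelve columns of yearly values
    month_means = [sum(col) / len(col) for col in by_month]
    general = sum(month_means) / len(month_means)
    return [m - general for m in month_means], general


devs, general = seasonal_deviations(rows)
for name, d in zip(MONTHS, devs):
    print(name, round(d, 2))
```

With so few years a single bad winter weighs heavily on the December figure, which is the first of the two reasons given above for expecting only a resemblance to the true seasonal curve.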

In figures 2, 3, 4 the curve for the latter half of the year 
is prefixed to that of the calendar year, because the character 
of the yearly waves is seen most clearly from minimum to 
minimum. It may be noticed that the wave in figure 3 is 
less definite in shape and has a smaller rise and fall than that 
of the earlier period shown in figure 2 ; it would appear that 
the seasons are losing their influence. 

If there is a definite annual period, that represented by 
figure 4, it may be expected that a figure of a shape similar 
to this — 

[Figure: small outline of the seasonal wave shown in figure 4.]

will be repeated annually in figure 1 ; it is shown well in 1864,
1882, and other years. In the great majority of cases the yearly
maximum is reached in December or January ; at the end of 1858
the maximum is absent, but is
replaced by a break in the rapidity of the fall ; at the end
of 1860 there is a rise, but the spring fall following is checked
by the general upward trend ; similar remarks apply to all 
the great fluctuations. There is no doubt that right along the 
line we find at nearly equal intervals these pointed crests above 
the line of averages. 

The minima are not so conspicuous, for the pointed shape
is absent, trifling causes bring them near the smoothed line, and
they are easily masked by a general fall or are absent because
of a general rise. In 1861, however, there is a distinct minimum
in spite of the strong upward tendency ; the minima are very
conspicuous throughout the fluctuation of 1865-70 ; and from
1859 to 1888 the minima are fairly marked, except in 1876,
1880, and 1881.

The following figures show the effect of a stationary, rising, 
and falling average annual rate on the shape of the seasonal 
wave : — 

[Figure a. Seasonal wave superimposed on stationary line of averages. Jan. to Dec.]

[Figure b. Seasonal wave superimposed on rising line of averages. Jan. to Dec.]

[Figure c. Seasonal wave superimposed on falling line of averages.]

These figures are drawn by adding or subtracting the average
monthly differences from the general average
month by month to or from the positions shown on the straight
lines joining the annual averages. On a rising line the spring
fall tends to become horizontal and the autumn rise steeper; 
on a falling line the spring fall becomes more rapid and the 
autumn rise is checked. 
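
The construction of these three figures can be imitated in a few lines of Python. This is a sketch under stated assumptions: the function `superimpose` and the seasonal differences are hypothetical, chosen so that the monthly change of the wave exactly equals the monthly change of the sloping line.

```python
def superimpose(seasonal, start, end):
    """Add the monthly seasonal differences to a straight line of averages
    running from `start` (January) to `end` (December)."""
    step = (end - start) / 11
    return [start + step * m + seasonal[m] for m in range(12)]

# Hypothetical seasonal differences: maximum in midwinter, minimum in July.
seasonal = [3, 2, 1, 0, -1, -2, -3, -2, -1, 0, 1, 2]

stationary = superimpose(seasonal, 5, 5)    # figure a
rising = superimpose(seasonal, 0, 11)       # figure b
falling = superimpose(seasonal, 11, 0)      # figure c

print("stationary:", stationary)
print("rising:    ", rising)
print("falling:   ", falling)
```

With these figures the spring fall on the rising line is exactly horizontal and the autumn rise is doubled, while on the falling line the spring fall is doubled and the autumn rise disappears.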

If this seasonal wave, added to the slower long-period 
changes, were the complete explanation of these numbers, 
figure 1 (p. 179) would be entirely composed of modifications
of figures a, b, and c. Figure a is exemplified especially
in 1855-57, 1864-65, 1871-73 ; figure b in 1860-61, 1866-67,
1877-78, 1883-85 ; figure c in 1859, 1863, 1880-82, 1886-89.

[Side-note: Elimination of fluctuations.]
As explained above, the two sets of causes are not indepen-
dent, and these figures are not reproduced exactly ; but the
resemblance is sufficiently close to make the
following method of eliminating seasonal fluctua-
tions partially applicable. Combine the monthly excesses and
defects just given with the original numbers, by subtracting the
excesses and adding the defects ; this process should tend to
produce a straight line, thus :

[Figure: the original numbers from figure 1 compared with the corrected figures.]

But the result is not more than a tendency, because of the 
unusual fall in January 1883, and it is difficult to find a perfect 
example. This method is applied in figures 6, 7, and 8 in an 
attempt to disentangle the seasonal fluctuations from the effects 
of the commercial crisis of 1872, the depression of 1879, and the 
turn of the tide in 1883. In figure 6 it is seen that January 1872 
was the best month relatively, though the absolute minimum 
was not reached till June of that year ; from this it appears that 
January 1872 was the turning point of the great inflation, a date 
somewhat earlier than that generally given. The date of the 
maximum of 1879 is left unchanged by this process, and that of 
the 1889 minimum is only shifted one month. 
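
The correction applied in figures 6, 7, and 8 amounts to a subtraction month by month, which may be sketched in Python as follows. The function `remove_seasonal` and the data are hypothetical; the example assumes the ideal case in which the numbers consist of a straight line with the seasonal excesses added.

```python
def remove_seasonal(values, excess, first_month=0):
    """Subtract each month's average excess from the original numbers.
    A defect is a negative excess, so subtracting it adds the defect back."""
    return [v - excess[(first_month + i) % 12] for i, v in enumerate(values)]

# Hypothetical monthly excesses and defects (Jan. to Dec.).
excess = [3, 2, 1, 0, -1, -2, -3, -2, -1, 0, 1, 2]

# Two years of hypothetical returns: a steadily rising line plus the wave.
trend = [10 + 0.5 * i for i in range(24)]
observed = [t + excess[i % 12] for i, t in enumerate(trend)]

corrected = remove_seasonal(observed, excess)   # the straight line again
```

Where, as in the actual returns, the seasonal and non-seasonal causes interact, the corrected figures will show only a tendency towards a straight line.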

[Side-note: Criteria of existence of period.]
We have still to discuss the criteria of the existence of a
period. In figure 1 the optical evidence is sufficient to suggest
the annual period, but it may be doubted whether
an annual fluctuation would be suggested by a
diagram representing wheat prices. It is clear that if the
monthly entries of any returns whatever were averaged in
months over any period of years, the averages for January,
February, &c., would not be exactly equal, even if there were
no seasonal influence. The following diagrams show various
averages :

[Figures: four diagrams of monthly averages, among them Unemployed ironfounders (July to June), Wheat prices in shillings (Jan. to Dec.), and Average date of first Sunday (Jan. to Dec.).]

Of these the first three may be expected to be seasonal, while 
the last, which shows the averages of the dates on which fell the 
first Sunday in 20 Januaries, 20 Februaries, &c., in a series of 
years, certainly is not.

The following simple tests may be applied to decide this 
point. If the period is in any way connected with the seasons, 
it will correspond to some extent to the ordinary weather charts 
of temperature, &c., which have a single annual maximum and 
corresponding minimum. Phenomena affected by the weather 
may also be expected to show a single maximum, nearly coin- 
ciding with the maximum or minimum temperature; thus the 
maximum unemployed coincides with the minimum length of 
daylight and precedes the minimum temperature. In some
cases a second subsidiary maximum may be shown, since, for
example, an excessive death rate may be due to excessive cold 
or heat ; but even in this example further analysis would pro- 
bably show that the one maximum was for the old, the other 
for the young. Wheat prices may also show two minima due 
to the harvests in the two hemispheres. The " Sunday " curve 
just given shows four maxima, and is not seasonal. More than 
one maximum is evidence against periodicity till their existence 
can be explained. 

The second test is to look at the serial diagram and notice
how often the maximum occurs in the same month ; non-periodic
causes will hide the maximum occasionally, but in
the long run one month will be predominant. In
figure 1 the maximum occurs in March and April twice each,
in February three times, in January eleven times, and in Decem- 
ber twenty-one times. The maximum is then generally in 
midwinter. The minimum is not in this case so well defined. 
The following table shows how this analysis can be ex- 
tended : — 

                                                      Times
                                                    out of 39.
The percentage of December is greater than that
    of the preceding November  -  -  -  -  -  -        33
The percentage of December is greater than that
    of the following January   -  -  -  -  -  -        28
The percentage of December is greater than that
    of the preceding July  -  -  -  -  -  -  -         33
The percentage of December is greater than that
    of the following July  -  -  -  -  -  -  -         30

The chances against so great a preponderance, if the seasons
had no influence, are respectively 70,000 to 1, 106 to 1, 70,000
to 1, and 940 to 1.* All the months may be separately tested
in the same way. This method by no means exhausts the
evidence, for we have only considered which of two months
is the greater, and not how great is the excess when it exists.
On this point the reader is referred to the paper by Professor
Edgeworth, On Methods of Statistics, in the Jubilee Volume
of the Royal Statistical Society, p. 206 ; this should, however,
be postponed till the mathematical treatment which follows in 
Part II. has been studied. 
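
The chances quoted above can be approximated by a direct count, sketched here in Python. The method is an assumption of this sketch, not necessarily the calculation of Part II.: each comparison is treated as an even chance, and a preponderance equally great in either direction is counted. The function name `odds_against` is hypothetical.

```python
from math import comb

def odds_against(times_greater, trials):
    """Odds against a preponderance at least this great, in either direction,
    when each of `trials` comparisons is an even chance (assumed model)."""
    upper_tail = sum(comb(trials, k) for k in range(times_greater, trials + 1))
    p = 2 * upper_tail / 2 ** trials     # both directions counted
    return (1 - p) / p

table = [("preceding November", 33), ("following January", 28),
         ("preceding July", 33), ("following July", 30)]
for label, count in table:
    print(f"December above {label}: {count} of 39, "
          f"about {odds_against(count, 39):,.0f} to 1")
```

On these assumptions the odds come out near 70,000 to 1 for 33 times out of 39, a little over 100 to 1 for 28, and somewhat under 940 to 1 for 30, which is of the same order as the figures in the text.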



* See Part II., Sect. I., infra.

5. Logarithmic Curves.

[Side-note: Need for graphic representation.]
A serious flaw in the graphic method as used in the previous
sections is that, when we are dealing with a series of increasing
figures, though the totals year by year may be
increasing, we are compelled to represent equal
increments on these totals by equal vertical dis-
tances ; thus an increment of £20 on a total of £20 is repre-
sented by the same vertical distance as an increment of £20 on
a total of £2,000. Thus in the annexed figure representing
exports, the fall from £52,000,000 to £42,000,000 in 1815-16 is
barely noticeable, though it is a fall of 20 per cent., and was
connected with very great distress in the manufacturing dis-
tricts ; while the fall from £305,000,000 in 1883 to £269,000,000
in 1886 attracts attention immediately, though it is one of
12 per cent. only. Again the increase of 34 per cent. which
took place between 1848 and 1850 appears insignificant in com-
parison with that of 29 per cent. from 1870 to 1872. When we
are attacking questions of causation it very frequently happens 
that we are more concerned to know the proportionate increase 
than the actual increase. When we are considering the gradual 
growth of our foreign trade, or when we are comparing the 
growth of trade of two countries, a diagram like that annexed 
is likely to give quite a wrong impression of the struggle that 
marked the early stages. We need then a diagram not of 
quantities, but of ratios, where equal vertical distances represent 
no longer equal absolute increments, but equal proportional 
increments, that is, equal rates of increase. By the use of 
logarithms a universal scale can be constructed which serves 
this purpose. The non-mathematical student can easily accustom 
himself to the use of diagrams so constructed, by studying one 
where the actual amounts represented are entered, and noticing 
that whatever part of the scale he takes, doubling, halving, in-
creasing by 20 per cent., and so on, are always represented by the
same vertical distances respectively. The construction of a
diagram on this scale is as follows : Write down
the numbers in the series to be represented ;
against them write down their logarithms ; on
paper divided into equal squares mark at equal intervals on a
vertical line numbers ascending in regular progression so as to
include all the logarithms found ; mark off the dates on a 
horizontal line ; and on the scale thus prepared mark in 
the logarithms, instead of the original numbers. The table on 
p. 191 and the diagram facing p. 190 show the figures of imports 
and exports thus treated. On the right hand of fig. 2 the position 
of the absolute numbers is given ; on the left the correspond-
ing logarithms. A given vertical distance, 1 inch, represents
the distance .301 on the logarithmic scale ; if we add this
quantity to the logarithm of any number, we obtain the
logarithm of twice that number, for log a + .301 = log a
+ log 2 = log 2a ; for instance, if we increase the height of
the position which represents £30 by 1 inch, we arrive at the
position which represents £60. Again if we now add 1.59 of an
inch, which represents .477 on the same scale as before, that is
log 3, to the logarithm of 2a, we obtain log 6a, and we have :

log 6a = .477 + log 2a = .477 + .301 + log a, as above
       = .778 + log a = log 6 + log a ;

that is, we arrive at the same position on this scale whether we 
go by means of two separate ratios or by a single compounded 
ratio. Thus a diagram drawn on this principle satisfies the 
necessary conditions that equal vertical distances represent the 
same process in whatever part of the scale they are taken, and 
that any number of points can be entered without leading to 
inconsistencies. At the end of this section is given a table of 
the logarithms of 1 to 1,000, correct to the third decimal place,
which will be found sufficient for this purpose. 
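
The property of the scale described above may be verified numerically. The following Python sketch uses the scale of the text, on which 1 inch stands for log 2 = .301; the function name `height` is hypothetical, and the export figures are those quoted earlier in this section, in millions of pounds.

```python
from math import log10

INCHES_PER_UNIT = 1 / log10(2)   # 1 inch represents .301 on the logarithmic scale

def height(value):
    """Vertical position, in inches, of a value plotted on the logarithmic scale."""
    return log10(value) * INCHES_PER_UNIT

doubling_small = height(60) - height(30)       # from 30 to 60
doubling_large = height(2000) - height(1000)   # from 1,000 to 2,000
tripling = height(180) - height(60)            # the further step of log 3

fall_1815 = height(52) - height(42)            # exports, 1815-16
fall_1883 = height(305) - height(269)          # exports, 1883-86

print(round(doubling_small, 3), round(doubling_large, 3), round(tripling, 3))
print(round(fall_1815, 3), round(fall_1883, 3))
```

Both doublings occupy one inch, the trebling about 1.59 inches, and the fall of 1815-16 is now the larger of the two falls, as its proportion requires.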

[Side-note: Examples of its use.]
Thus on the diagram given we can see at once that imports
were doubled in value between 1810 and 1836, again between
1840 and 1853, again between 1855 and 1866,
and that their value increased 40 per cent. be-
tween 1886 and 1899. Or we may notice that the excess of the
value of imports over that of exports was 40 per cent. of the
latter both in 1850 and in 1880 ; that the value of imports in
1899 was thrice that of exports in 1860.

If the eye has been carefully educated to understand a 
diagram of this sort, if the fact that it is a diagram of ratios,
not of quantities, is firmly impressed on the mind, then the
diagram answers perfectly the object of the graphic method,
that is, it gives a true instantaneous impression of a complex
series of facts. If, on the other hand, it is found that a true
impression is not received, through inability to take the right
mental position, then diagrams on the natural scale should be 
employed only, always with the recollection that they may give 
false impressions of ratio.* 

[Side-note: Velocity and acceleration.]
It is to be noticed that no base line should be given in
diagrams of this class, otherwise a false impression is at once
obtained. Notice further that, while equal verti-
cal differences represent equal ratios from any
part of the diagram to any other, instead of equal increments as
on the natural scale, equal degrees of slope represent equal ratios
of increase (equal accelerations), instead of equal additions in 
equal times as on the natural scale (equal velocities). On the 
logarithmic scale a line rising with convexity to the horizontal 
shows that the ratio of increase is growing, as in imports from 
1830-1853 (if the line is smoothed), while concavity, as from 1854 
to 1873, shows a slackening ; but on the natural scale the line is 
convex almost throughout the two periods, showing that the 
actual increments were increasing all the time. 

[Side-note: Useful application to index-numbers.]
It would be useful, if space permitted, to offer several
diagrams on both scales ; for in many series of figures the
differences exhibited by the two methods are very
instructive. One case may be signalized where the
logarithmic scale is specially important, that is,
when the original numbers represent ratios, not actual numbers.
Thus in Mr Sauerbeck's well-known diagram, drawn on the
natural scale, representing his index-numbers of prices, all the
numbers included are percentages of their values in certain
defined years. Suppose that 100, 80, and 60 are the index-
numbers for three years, then on the natural scale the decre-
ments are represented by equal distances and appear to be
equal. The falls in the value of gold, however, are by no means
equal in the two periods. In the first, the fall from 100 to 80
is one of 20 per cent. ; 16s. at the second date would buy goods
which cost £1 at the first. In the second, the fall from 80 to 60
is one of 25 per cent. ; 15s. at the last date would buy goods
which cost £1 at the middle date. For the purposes of price
index-numbers it is ratios which are important and which the
diagram should represent.
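
The example of the index-numbers 100, 80, and 60 may be worked out in a few lines of Python. The variable names are hypothetical; the sketch only contrasts the distances that would be drawn on the two scales.

```python
from math import log10

index_numbers = [100, 80, 60]
pairs = list(zip(index_numbers, index_numbers[1:]))

natural_steps = [a - b for a, b in pairs]                  # distances on the natural scale
percentage_falls = [100 * (a - b) / a for a, b in pairs]   # 20 and 25 per cent.
log_steps = [log10(a) - log10(b) for a, b in pairs]        # distances on the logarithmic scale

print(natural_steps, percentage_falls, [round(s, 3) for s in log_steps])
```

The two decrements are equal on the natural scale, while on the logarithmic scale the second step is the longer, in agreement with the percentages.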

* Professor Marshall suggests a simple method of correcting this false
impression in his paper On the Graphic Method of Statistics, in the Jubilee
volume of the Journal of the Royal Statistical Society, p. 257 seq.

[Side-note: Comparisons on the logarithmic scale.]
The logarithmic scale has special uses in the comparison of
series of figures, and the methods discussed in the section
devoted to that subject can be readily adapted.
The difficulty of the choice of units in comparing
quantities of different natures disappears when we
deal only with ratios ; we need no longer trouble about the
method of percentages. In investigating causal relations we are
more likely to find close connection in ratios than in quantities ;
for if one set of phenomena are connected with another, it is
more likely that the relation will be a proportional one (e.g.,
that an increase of 10 per cent. in some measurable charac-
teristic of the one corresponds to an increase of 8 per cent. in
a characteristic of the other), than an absolute quantitative one
(e.g., that an increase of 2s. in a price, at whatever point it
stands, corresponds to a decrease of 100 in the number of
purchasers). Resemblance between two curves on the loga-
rithmic scale will mean the correspondence in proportional
change, while resemblance on the natural scale means corre-
spondence in absolute change.

There is less trouble in this new method in equating averages 
than before. For if the logarithms of two series are taken, it is 
quite immaterial at what height on a logarithmic scale the two 
are plotted out ; alteration of height only means multiplication 
of all the items by a constant quantity, and does not alter the 
appearance or proportion of their fluctuations. The method to 
be employed is as follows : — Draw the curves representing two 
series of figures on a logarithmic scale; then shift the lower 
curve vertically upwards to and over the other, till the closest 
possible correspondence is obtained ; draw it in in this position, 
and the two series can be accurately compared. 

[Side-note: Equation of fluctuations.]
The following example employs this method with a further
development, corresponding to that of p. 177, supra, where
fluctuations are equated. In the earlier method
we used the average as a position from which to
measure the various items, and adapted the scales ; a similar
method might again be used, but it is more convenient to keep
to one logarithmic scale, and now we have no base line to
consider. Calculate the fluctuations much as before, but express
them as percentages of the adjacent maxima before taking their
average. In the following example it is found that a fluctuation
of 8.4 per cent. in the number employed, in those trade unions
whose returns are accessible,* corresponds to one of 9.7 per cent.
on the marriage rate. To investigate a possibly closer corre- 
spondence, assume that a portion of the number employed do 
not influence the marriage rate, and find what part must be 
subtracted before this 8.4 per cent, of the total forms as much 
as 9.7 per cent, of the remainder; the average percentage of 
members of the trade unions at work in the selected period was 
95.1 ; 8.4 per cent, of this is 7.99, which forms 9.7 per cent, of 
82.4. Thus 12.7, the difference between 95.1 and 82.4, may be
considered as not influencing the question, and subtracted
throughout before logarithms are taken. This process would
be replaced on the natural scale by equating the averages of
two series, and drawing one base line so far below the other
that average fluctuations would be represented by the same
vertical distance for both series ; which process is exactly
equivalent to that adopted on p. 177. Expressed algebraically,
we are now investigating the equation :

log (y - c) - log x = k, a constant,

where c and k are constants to be so selected as to give the
closest fit, and y and x are the quantities to be compared.
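
The arithmetic by which the constant c is found in this example can be retraced in Python. The function name `part_to_subtract` is hypothetical; the figures 95.1, 8.4, and 9.7 are those given in the text.

```python
def part_to_subtract(average_level, fluctuation_pct, target_pct):
    """Find the part c of `average_level` to be set aside so that the same
    absolute fluctuation forms `target_pct` per cent. of the remainder."""
    fluctuation = average_level * fluctuation_pct / 100   # 7.99 in the text
    remainder = fluctuation * 100 / target_pct            # 82.4 in the text
    return average_level - remainder

c = part_to_subtract(95.1, 8.4, 9.7)
print(round(c, 1))
```

The result agrees with the 12.7 of the text to the first decimal place.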

In the following diagrams, figure i gives the figures in the 
natural scale ; figure 2 gives them on the logarithmic scale, after 
they have been arranged so as to make average percentage 
fluctuations equal ; while in figure 3 the shorter period, 1880-96,
is treated in a method precisely similar to that of figure 2. 

* The figures in columns 2 and 4 in the second table on the next page are 
taken from Mr G. H. Wood's paper on Some Statistics of Working Class 
Progress since 1860, Statistical Journal, 1900, where a valuable logarithmic
diagram will be found, illustrating many of the points of this section.

CHAPTER VIII.
ACCURACY. 

Introductory. 


[Side-note: The nature of measurement.]
There is not in existence a perfectly accurate measurement,
physical or economical, just as there is no perfectly straight line
or perfect fluid. We can best illustrate the nature
of economic measurements by considering that of
physical. It is easy to weigh substances accurately to 1 gram :
then by obtaining a good balance, we can, as our apparatus is 
improved, weigh accurately to a centigram, milligram, and one- 
tenth of a milligram ; but for accuracy beyond this the balance 
fails us. Similarly in measuring angles, the naked eye can 
distinguish an object which subtends one-thirtieth of a degree ; 
with a sextant a measurement can be taken correctly to fifteen 
seconds of arc ; the Greenwich astronomers can make observa- 
tions correct to one-hundredth part of a second, but we again 
come to a point beyond which precision is unattainable. 

In such cases the result is stated as correct to a milligram, 
or whatever it may be ; in the same way we speak of an esti- 
mated sum of money correct to a pound. 

[Side-note: Physical and statistical measurements.]
A task which has considerable resemblance to some statis-
tical estimates, is the measurement of the parallax of the sun,
which determines its distance from the earth.
During the eighteenth century astronomers esti-
mated it at a figure equivalent to 96,000,000 miles.
As methods of observation and instruments were improved, 
observers began to agree that the whole number of seconds was 
8, but gave various estimates for the first decimal figure. Since 
1865 there have been very few estimates which have not given 8 
as the nearest figure for this place (8.8"), while more recent 
observations agree in making the parallax from 8.76" to 8.78". 
We may, therefore, consider that the distance is now accurately 
known to within 1 in 400. Notice in this connection, first, that
the earlier observations have been subject to corrections ;
secondly, that better agreement has been attained as time has
gone on ; thirdly, that neither absolute agreement nor ab- 
solute accuracy have yet been obtained. So it is with statistical 
measurements ; we might instance the gradual settlement of the 
curve representing expectation of life, the measurement of the
fall in prices, and the development of wage statistics. 

[Side-note: Degrees of possible accuracy.]
Again in physical measurements, though we can sometimes
reach a very high degree of accuracy, as, for instance, in the
weight of a cubic foot of water which could doubt-
less be known correctly to one part in a million, in
other cases we are glad if we can measure to one part in ten, as, 
for instance, in the distance of the nearest fixed star from us, 
which is, roughly, from 34 to 37 billion miles. So in statistics 
it is something if we know that the total capital of the United 
Kingdom was between 7^ and 10 thousand million pounds in 
1885, or if we know that the average weekly wage of working- 
men in full work was from 21s. to 27s. in 1886. The weak point 
in such statements is that often when we have made an estimate, 
which we know to be inexact, we are not able to give any esti-
mate of the limits of the error. We are not so definite as The
Modern Traveller who 

"... knew the weather to a T, 
The longitude to a degree, 
The latitude exactly."

We are not able to say "our estimate is 24s. 5d., we are not
certain to 1d., but it is not possible that we are as much as
1s. wrong" ; whereas in physical measurements we can often give
the result correct to the smallest graduation of the instrument 
employed. 

[Side-note: The accuracy generally needed.]
On the other hand, though we cannot obtain exactness, we
can in many cases estimate to that degree of accuracy which is
required for practical purpose. In common use
only a certain conventional accuracy is needed.
Thus, to take some miscellaneous instances, the area of an estate 
is given in acres, roods, and poles, but not correct to square 
yards ; the market prices of shares do not change less than 
xV ; we keep the day, not the hour, of our birth ; railway
time-tables do not show seconds ; ocean steamers are timed to 
start at certain hours, not minutes ; height is measured correct 
to one-tenth of an inch ; a hundred yards race is timed to one-
tenth of a second. Similarly in statistical estimates, we seldom
need that our results shall be accurate within one per thousand,
or even 1 per cent. One per thousand of the working week is
only three minutes ; 1 per cent. of the week's wage is only 3d.
We do not care to know the population of London within 100,
the expenditure of the Exchequer within £1,000, or the expecta-
tion of life within a day. It is often possible to attain practical
accuracy within such limits. 

Definition of Error. — For purposes of measurement we 
may take the following definition : — The error in an estimate 
is the ratio of the difference between the estimate and the true 
value, to the estimate ; the error is to be reckoned positive when
the true value exceeds the estimate. 

Thus if the average weekly wage of agricultural labourers
was in reality 14s., and we estimated it as 13s., our error would
be $\frac{14-13}{13} = \frac{1}{13}$, or 7.7 per cent. ; if we had estimated it as 15s., the
error would be $\frac{14-15}{15} = -\frac{1}{15}$, or -6.6 per cent.

In algebraic notation, if $u$ be the measurement of a quantity whose
true value is $u'$, then $\frac{u'-u}{u}$ is the error in the estimate, which we shall
call $e$ ; so that $e = \frac{u'-u}{u}$, and $u' = u(1+e)$.* $\frac{1}{e}$ is an appropriate measure
of the accuracy or precision of an estimate, becoming infinite when the
error is zero.
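
The definition may be put into a few lines of Python and tried on the two estimates of the labourers' wage. The function names `error` and `precision` are hypothetical; exact fractions are used so that the results can be compared directly with the text.

```python
from fractions import Fraction

def error(estimate, true_value):
    """(true value - estimate) / estimate; positive when the true value
    exceeds the estimate, as in the definition above."""
    return (Fraction(true_value) - Fraction(estimate)) / Fraction(estimate)

def precision(estimate, true_value):
    """Reciprocal of the error; None stands for the infinite precision
    of an estimate with no error."""
    e = error(estimate, true_value)
    return None if e == 0 else 1 / e

low = error(13, 14)     # Fraction(1, 13), about 7.7 per cent.
high = error(15, 14)    # Fraction(-1, 15)

print(low, float(low) * 100)
print(high, float(high) * 100)
```

Multiplying the estimate by one plus the error returns the true value, which is the relation between the true value and the estimate stated above.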

[Side-note: Statement of errors.]
In the nature of things, when we are dealing with errors,
we do not know their magnitude ; the most we can know
is their probable and possible extent. We
might estimate, for instance, the percentage of
unemployed in a certain year as 4.5, and add, from informa-
tion in our possession (coming from a study of wage-bills
or the reports of relief agencies), that we considered this to
be within .5 of the fact ; we should then write the number
4.5 ± .5, meaning that the error in the estimate as defined above
was unlikely to be more than $\frac{.5}{4.5} = \frac{1}{9}$, or 11 per cent., and the
precision was 9. In such a case we can also give definite limits.

* This and most of the following algebraic paragraphs are from a paper
on the Relations between the Accuracy of an Average and that of its
Constituent Parts, by the present author, in the Statistical Journal of
December 1897.

The percentage unemployed must lie between 0 and 100 ; and
if we could actually enumerate 1 per cent. of the working-class
as out of work, and also 92 per cent. as in work, we should
know that the number required was between 1.0 and 8.0 per
cent., and the maximum error in our estimate, 4.5, was $\frac{3.5}{4.5} = \frac{7}{9}$, or
77 per cent. Even this is more precise than the original state-
ment, " the percentage is 4.5, error unknown." By further investi-
gation we might perhaps bring the limits of error nearer to
each other, and decide that it was practically certain that the
percentage required was between 4 and 5 ; then we ought to
say " the number unemployed is .04 . . . of the working-class, the
estimate being correct to the last figure given." This statement
is of the same nature as, " The body weighs 15 lbs. 3 oz., correct
to an ounce."
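
The limits of error in the example of the unemployed can be computed in the same manner. In this Python sketch the function name `error_bound` is hypothetical; it takes the estimate with the lower and upper limits of the true value and returns the greatest error possible, using the definition of error given above.

```python
from fractions import Fraction

def error_bound(estimate, lower, upper):
    """Greatest possible error when the true value lies between the limits."""
    estimate, lower, upper = Fraction(estimate), Fraction(lower), Fraction(upper)
    return max(upper - estimate, estimate - lower) / estimate

likely = error_bound("4.5", "4.0", "5.0")   # 4.5 ± .5
widest = error_bound("4.5", "1.0", "8.0")   # enumerated limits, 1.0 to 8.0

print(likely, 1 / likely)   # 1/9, precision 9
print(widest)               # 7/9
```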

While, on the one hand, it is clear that we cannot often 
obtain close definite limits to our errors, on the other we can 
very often see that some of the digits in a total are almost 
certainly right and others almost certainly wrong. Thus when 
we see in the Registrar-General's Report that the population of
the United Kingdom in 1895 was 39,124,496, the estimate being
made from the census of 1891, and the increase calculated on
the basis of the increase since 1881, we may be certain that
the last two, or the last three, digits are no better than guess-
work ; while the first two, or the first three, are correct. Thus
the statement should read : Population was 39.1 millions, or
39,124,000 ± 5,000, or whatever figures our examination of the
varying rate of progress of the population led us to adopt, and 
this statement is actually more correct than the previous one. 

[Side-note: Neglect of minutiæ.]
It is the custom in many classes of estimates to give the
figures to the uttermost farthing. This is possibly right in
official publications ; for the business of the office
is to receive and tabulate returns, stating how
and whence they came, and leaving to the economist or the
statistician the task of deciding the degree of accuracy per-
taining to them. But in summary descriptions and accounts,
and in scientific estimates, it is not merely unnecessary to give
these last figures (both because they are not accurately known,
and because they generally have no importance to the argument
or significance to the reader), but it is positively inaccurate.
The easiest way to avoid the inaccuracy is simply to state totals
in so many thousands (e.g., the earth is 8,000 miles in diameter),
or if for any reason more exact measure be required (as when 
we are comparing the equatorial diameter with the smaller one 
through the poles), the scientific way is to give the number as 
far as it has been fairly calculated, and to indicate its precision. 



Rules for Computing the Effect of Errors. 

We may now give some rules connecting the errors of a 
complex estimate with those of the elements which form it. 

I. The error in an estimated sum is equal to the sum of the
errors in the parts when each is multiplied by the ratio of the
corresponding part to the sum.

[Side-note: Error in sum.]
For if we estimate $n$ quantities as $u_1, u_2, \dots, u_n$ and their sum
as $u$, so that $u = u_1 + u_2 + \dots + u_n$, and the errors of the
quantities are $e_1, e_2, \dots, e_n$, and that of the sum is $e$,
then the true value of the sum is $u(1+e)$, and the true values of the
parts are $u_1(1+e_1)$, $u_2(1+e_2)$, ..., so that

\[ u(1+e) = u_1(1+e_1) + u_2(1+e_2) + \dots + u_n(1+e_n) ; \]

but
\[ u = u_1 + u_2 + \dots + u_n ; \]

hence, by subtraction,
\[ ue = u_1 e_1 + u_2 e_2 + \dots + u_n e_n , \]

and
\[ e = e_1 \times \frac{u_1}{u} + e_2 \times \frac{u_2}{u} + \dots + e_n \times \frac{u_n}{u} . \]

The formula is easily adapted to the case where some of the 
parts are subtractive. 

To take an arithmetical example, if two trade unions return
respectively 555 and 45 members as out of work, while the true
numbers are 565 and 50, so that the errors are $\frac{2}{111}$ and $\frac{1}{9}$, then
the error in the sum is by the above rule

\[ \frac{2}{111} \text{ of } \frac{555}{600} + \frac{1}{9} \text{ of } \frac{45}{600} = \frac{1}{40}, \text{ or } 2\tfrac{1}{2} \text{ per cent.} \]
55 600 ' 9 600 40* ^ ^ 

The greater error in the returns of the smaller union has little 
effect on the total. 

We can apply the rule to the important case where we 
can estimate a great part of a required total with considerable
accuracy, while we are ignorant of a smaller part. Thus we
may receive returns from several unions that 33,650 are out
of work, and have reason to know that the error is not more
than 1 per cent., while some smaller unions do not send any
returns ; we make an estimate for the smaller unions, say that
1,000 of their members are unemployed, and suppose a very
large error, say $\frac{2}{3}$ or 67 per cent. Then the error in the total is
less than

\[ \frac{1}{100} \text{ of } \frac{33650}{34650} + \frac{2}{3} \text{ of } \frac{1000}{34650} = 2.9 \text{ per cent.,} \]

an error very much nearer that of the larger returns than that 
of the smaller. In the preceding sentence we say " less than," 
because we assume that we have taken an outside limit for the 
smaller error. 
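
Both arithmetical examples of Rule I. may be checked by a short Python sketch. The function name `error_in_sum` is hypothetical; it applies the rule as stated, each error being multiplied by the ratio of its part to the estimated sum.

```python
from fractions import Fraction

def error_in_sum(estimates, errors):
    """Rule I.: sum of the errors of the parts, each multiplied by the
    ratio of the corresponding estimated part to the estimated sum."""
    total = sum(estimates)
    return sum(e * Fraction(u, total) for u, e in zip(estimates, errors))

# Two unions returning 555 and 45, the true numbers being 565 and 50.
first = error_in_sum([555, 45], [Fraction(10, 555), Fraction(5, 45)])

# Large returns of 33,650 (error 1 per cent.) and a guess of 1,000 (error 2/3).
second = error_in_sum([33650, 1000], [Fraction(1, 100), Fraction(2, 3)])

print(first, float(first) * 100)      # 1/40, or 2.5 per cent.
print(round(float(second) * 100, 1))  # per cent.
```

The first result can also be confirmed directly, since the true sum is 615 against an estimate of 600.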

II. The error in the arithmetic average of several estimates is
the sum of the errors of these estimates, when each is multiplied by
the ratio of the corresponding estimate to that of the sum of the
estimates.

[Side-note: Error in average.]
For if $m_1, m_2, \dots, m_n$ are $n$ estimates of quantities whose true
values are $m_1(1+e_1)$, $m_2(1+e_2)$, ..., the estimated and
true averages are respectively

\[ \frac{m_1 + m_2 + \dots + m_n}{n} \quad\text{and}\quad \frac{m_1(1+e_1) + m_2(1+e_2) + \dots + m_n(1+e_n)}{n}, \]

and the error in the average is

\[ \frac{\dfrac{m_1(1+e_1) + m_2(1+e_2) + \dots}{n} - \dfrac{m_1 + m_2 + \dots}{n}}{\dfrac{m_1 + m_2 + \dots}{n}} = \frac{e_1 m_1 + e_2 m_2 + \dots}{m_1 + m_2 + \dots} = e_1 \times \frac{m_1}{S} + e_2 \times \frac{m_2}{S} + \dots, \]

where $S$ denotes the sum of all the $m$'s.






It is easily seen that no individual error can have much 
influence on the result, that the error in the average would be 
nearly of the same magnitude as one of the individual errors, if 
these were not very unequal, and all positive or all negative, and 
that if, as is generally the case, some are positive and some 
negative (a point we shall consider presently), the error would be 
considerably lessened. 
III. The error in a weighted average is the sum of (1) an error
due to errors in the quantities, similar to the error of an unweighted
average, and (2) an error due to errors in the weights, which becomes
very small when the original quantities are nearly equal.

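Error in weighted average.

For if $w_1, w_2, \dots$ be estimated weights applied to estimated
quantities $m_1, m_2, \dots$, and if the true values of the
weights are $w_1(1+\epsilon_1)$, $w_2(1+\epsilon_2)$, . . . , and of the
quantities $m_1(1+e_1)$, $m_2(1+e_2)$, . . . , then the error is:

$$\left[\frac{\Sigma\{m(1+e)\,.\,w(1+\epsilon)\}}{\Sigma\{w(1+\epsilon)\}} - \frac{\Sigma mw}{\Sigma w}\right] \div \frac{\Sigma mw}{\Sigma w}.$$

If we simplify this expression and neglect the products of two of the
errors $e$ and $\epsilon$ (for if $e$ and $\epsilon$ are each .1, their product is only .01), we
obtain:

Error in weighted average is:

$$\left[e_1\frac{m_1w_1}{\Sigma mw} + e_2\frac{m_2w_2}{\Sigma mw} + \dots\right] + \left[(m_1 - m_2)(\epsilon_1 - \epsilon_2)\frac{w_1w_2}{\Sigma w\,.\,\Sigma mw} + \text{similar terms for all pairs of quantities}\right]$$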

If $m_1 - m_2$ is small, that is, if two of the original quantities
are nearly equal, the first term in the second bracket becomes
very small. Very great errors are required in the weights to
make any appreciable error in the average. In fact, the errors
in the quantities have so much more influence than the weights on
the weighted average of not very unequal quantities, that errors in
the weights can generally be neglected. Many numerical examples 
of this principle were given in the chapter on weighted averages. 
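The relative smallness of the weight term can be seen in a sketch that compares the exact error of a weighted average with the two first-order parts of Rule III. All names and figures below are hypothetical illustrations, not taken from the text; note that the weights are allowed errors of 5 to 10 per cent., the quantities only 1 to 3 per cent.

```python
def weighted_average(ms, ws):
    return sum(m * w for m, w in zip(ms, ws)) / sum(ws)


def exact_error(ms, ws, es, eps):
    """Proportionate error of the estimated weighted average, computed directly."""
    est = weighted_average(ms, ws)
    true = weighted_average(
        [m * (1 + e) for m, e in zip(ms, es)],
        [w * (1 + x) for w, x in zip(ws, eps)],
    )
    return (true - est) / est


def first_order_parts(ms, ws, es, eps):
    """Rule III to first order: (part due to quantities, part due to weights)."""
    smw = sum(m * w for m, w in zip(ms, ws))
    sw = sum(ws)
    from_quantities = sum(e * m * w for m, w, e in zip(ms, ws, es)) / smw
    from_weights = (
        sum(x * m * w for m, w, x in zip(ms, ws, eps)) / smw
        - sum(x * w for w, x in zip(ws, eps)) / sw
    )
    return from_quantities, from_weights


# Hypothetical data: three not very unequal quantities.
ms = [20.0, 22.0, 25.0]      # estimated quantities
ws = [3.0, 5.0, 2.0]         # estimated weights
es = [0.02, -0.01, 0.03]     # errors in the quantities
eps = [0.10, -0.10, 0.05]    # errors in the weights (much larger)

from_quantities, from_weights = first_order_parts(ms, ws, es, eps)
exact = exact_error(ms, ws, es, eps)
print(from_quantities, from_weights, exact)
```

The second function writes the weight term as a difference of two sums; this is algebraically the same as the sum over pairs of quantities, and is simply shorter to code.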

IV. The error in a product is approximately the sum of the
errors in its factors, due regard being paid to sign.

Error in product.

For if $f_1, f_2, \dots, f_n$ are the estimated factors, whose true values
are $f_1(1+e_1)$, $f_2(1+e_2)$, . . . , then the error of the product

$$= \frac{f_1(1+e_1)\,.\,f_2(1+e_2)\dots - f_1f_2\dots}{f_1f_2\dots} = (1+e_1)(1+e_2)\dots - 1 = e_1 + e_2 + \dots + e_n,$$

if we neglect products of two or more $e$'s.

The $e$'s are equally likely, à priori, to be positive or negative.
If two $e$'s are of different signs, they tend to neutralise one
another. The error in a product may be great if all the errors
of the factors are of the same sign, even if they are small
individually.

For example, if we estimate that 100 men are earning on the
average 25s. each, while in reality there are 105 men earning
26s., the error in the estimated total sum earned is, by formula,

$$\tfrac{5}{100} + \tfrac{1}{25} = .09.$$

If, with the same estimates, the real quantities had been 105
and 24s., the error in the product would have been $\tfrac{5}{100} - \tfrac{1}{25} = .01$.
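A brief sketch shows how far the rule for a product departs from the exact figure in the two cases of errors of 5/100 and 1/25. The function names are assumptions made for illustration only.

```python
def product_error_exact(errors):
    """Exact proportionate error of a product whose factors have the given errors."""
    result = 1.0
    for e in errors:
        result *= 1 + e
    return result - 1


def product_error_rule(errors):
    """Rule IV: the sum of the errors, neglecting products of two or more of them."""
    return sum(errors)


same_sign = [5 / 100, 1 / 25]        # 105 men for 100, 26s. for 25s.
opposite_sign = [5 / 100, -1 / 25]   # 105 men for 100, 24s. for 25s.

for errors in (same_sign, opposite_sign):
    print(product_error_rule(errors), product_error_exact(errors))
```

In the first case the exact error is .092 against .09 by the rule; in the second .008 against .01. The neglected term is the product of the two errors, .002.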

V. The error in a ratio is approximately the difference between
the errors in its two terms, due regard being had to sign.

Error in ratio.

For if $u_1$, $u_2$ be the estimated terms, whose true values are $u_1(1+e_1)$
and $u_2(1+e_2)$, then the error in the ratio is:

$$\left[\frac{u_1(1+e_1)}{u_2(1+e_2)} - \frac{u_1}{u_2}\right] \div \frac{u_1}{u_2} = \frac{1+e_1}{1+e_2} - 1 = \frac{e_1 - e_2}{1+e_2} = e_1 - e_2,$$

if we neglect terms of the second order in the $e$'s.

If the errors in the terms are both positive or both negative,
they tend to neutralise one another ; if they are also nearly equal, 
the error in the ratio becomes very small. 

We can apply Rule V. to the error in comparison of two 
averages of similar quantities estimated at different dates. 

With the same notation as under Rules II. and III., using $m$, $w$,
$e$, $\epsilon$ for the letters at one date, and $m'$, $w'$, $e'$, $\epsilon'$ for similar quantities
at another date, then the error in the ratio of the simple average
of $m'_1, m'_2, \dots$ to the simple average of $m_1, m_2, \dots$ is:

$$\left\{\Sigma\left(e'\frac{m'}{S'}\right)\right\} - \left\{\Sigma\left(e\frac{m}{S}\right)\right\} = \left(e'_1\frac{m'_1}{S'} - e_1\frac{m_1}{S}\right) + \left(e'_2\frac{m'_2}{S'} - e_2\frac{m_2}{S}\right) + \dots$$

Now if the quantities have not changed much during the period
between two observations, the fraction $\frac{m'_1}{S'}$ will differ little from $\frac{m_1}{S}$,
and so on.

Neglecting these differences in comparison with the quantities themselves,
a legitimate process when we are estimating the approximate
influence of errors, we have:

Error in the ratio of the simple averages $= \Sigma\left\{\frac{m'_1}{S'}(e'_1 - e_1)\right\}$.

If the two estimates have been made under nearly similar circumstances,
leading to similar chances of errors, $e_1$ and $e'_1$ are likely to be
not only of the same sign, but nearly equal.

Write $d_1, d_2, \dots$ for $(e'_1 - e_1)$, $(e'_2 - e_2)$, . . . , and we have:

Error $= \Sigma\left\{d_1\frac{m'_1}{S'}\right\}$, where the $d$'s may be small.

The corresponding analysis for the error in the ratio of two 
weighted averages is too complicated to be given here ; * but 
using the principle that errors in weight are less important than 
errors in quantity, which applies with slight modifications, we
may use the formula just given for the first approximation to 
the error in the ratio of two weighted averages. This formula 
may be put in words : — 

VI. The error in the ratio of two averages of similar series
of quantities, estimated at different dates, is approximately equal
to the sum of the differences between the errors in the corresponding
terms of the two series, each multiplied by the ratio of
the latter of these corresponding terms to the sum of all the terms
at the latter date.

Comparison of averages.

This rule is so important that it will be worth while to
illustrate it by an example, in which a further
quantity will be introduced.

If in each of two years we are able to estimate, as in our example
under Rule I., one part of a total more accurately than another part, we
can use the following formulae:

                                            First Year.                  Second Year.
Estimated numbers or weights                $w$; error $\epsilon$        $w'$; error $\epsilon'$
Estimated average income, or quantity       $m_1$; error $e_1$           $m'_1$; error $e'_1$
Estimated number, less accurately known     $rw$; error in $r$, $\rho$   $r'w'$; error in $r'$, $\rho'$
Estimated income                            $m_2$; error $e_2$           $m'_2$; error $e'_2$

$e_1$ and $e'_1$ are, by hypothesis, less than $e_2$ and $e'_2$.

Error in average for first year:

$$\left[\frac{w(1+\epsilon)\,m_1(1+e_1) + r(1+\rho)\,w(1+\epsilon)\,m_2(1+e_2)}{w(1+\epsilon) + r(1+\rho)\,w(1+\epsilon)} - \frac{wm_1 + rwm_2}{w + rw}\right] \div \frac{wm_1 + rwm_2}{w + rw}$$

$$= e_1\frac{m_1}{m_1 + rm_2} + e_2\frac{rm_2}{m_1 + rm_2} - \rho\,\frac{r}{1+r}\,.\,\frac{m_1 - m_2}{m_1 + rm_2},$$

if we neglect products of $e$ and $\rho$.

* It will be found in the article cited on p. 201 above; a further approximation
also for the error in the ratio of simple averages is there given.

Here the errors, $e_2$ and $\rho$, connected with the less accurately known
part, are each multiplied by $r$, the ratio of the weight of that part to the
weight of the better known part; while $e_1$, the remaining error, is by
hypothesis small.

If for simplicity of argument we assume that the ratio of the unknown 
part to the whole (but not the error in estimating it) has remained 
unchanged, and also that the ratio of the estimated average incomes of 

the two parts has not altered, we have for the error in comparison — 

• 

Thus in estimating the change in average wages of Scotch 
agricultural wages, we have figures similar in character to the 
following : — 

1867. Married Ploughmen. ^^^' ^^I'^^r^'''' 

EsiimtUed number - t,ooo Average income, ;f36 1,200 £49 o o 

Su/^^sed true numhei 1,010 ,, „ 35 1,220 48 o o 

Farm-Servants. 
EstimtUed number - 200 Average income — 240 

Money - ;£2i £2y 5 o 

Estimated value 

of board - 13 14 o O 



Total - ;f34 £41 5 o 

Supposed true number 220 Total income - £yj 240 ;f 47 o o 

Here a/= 1,000, »ii = 36, r = |, »i2 = 34, ^1= 1,200, m^ = /^(), r=|, 

ftl-_2 >,1_220 

Here it is supposed that we have overvalued the income of 
the married ploughmen, and undervalued that of the farm- 
servants in both cases. We suppose, as is the fact, that the 
value of the board and other perquisites of the farm-servants 
cannot be estimated with precision, and that the proportionate 
numbers in the two classes are not accurately known. 

Substituting in the above formula we find that the error in 
the estimated ratio of the average incomes of the two classes 
together in the two years is — 

- .006, due to errors in estimates of income of ploughmen.
+ .008, due to errors in estimates of income of servants.
- .001, due to errors in ratios of the numbers in the two classes.

Thus the last error, due to weights, is very small, and the
second error, due to ignorance of the value of board, is reduced
by the smallness of the number employed to a magnitude comparable
with the first.

The whole error is, therefore, by formula +.001. Going to
the actual figures, we find the estimated ratio of the second to
the first to be 1.338 to 1, and the supposed true ratio to be
1.335 to 1; that is, the error is $\frac{.003}{1.338} = .002$ . . .

The difference between the two methods of calculation is 
then 1 in the third decimal place, which is accounted for by
the neglect of the less important terms. 

It is to be noticed that the error in the ratio of two quantities 
is not the same as the error which we might be inclined to 
estimate, the error in the percentage increase. Thus in the case 
just taken, the estimated and true percentage increases are 33.8 
and 33.5, and the error in the percentage increase is .01. For
accuracy in such calculations, then, we require the error found 
by formula, according to Rule VI., to be very small. 




Biassed and Unbiassed Errors. 

In the consideration of all errors in averaging or comparing,
it is important to distinguish two classes of errors, those which
are biassed and those which are unbiassed. The
difference can be made clear by illustrations. If a
number of men are sent to investigate the condition
of an industry in different places, with a view of proving
that wages are high, conditions of work healthy, and so on, they 
would probably, by examining only the best conducted works, 
and taking the wages only of the more skilled and regular work- 
men, produce an average for each town which would be too high. 
On the other hand, if there was no brief to be held, but the 
investigation was impartial, the commissioners would in some 
towns take too high an average, in others too low, according to 
their idiosyncrasies and to circumstances. In the first case, the 
errors would be biassed, all in the same direction, all tending to 
increase the average, whose errors would be equal to the average 
error in the different towns. In the second case, the errors 
would be unbiassed, just as likely to be in excess or defect, and 

O 



ii6 
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the more estimates made, the smaller would the resulting error 
be. The following figures would illustrate this : — 



                                    Fact.    Biassed      Unbiassed
                                             Estimate.    Estimate.
                                     s.        s.           s.
Average Wages in District a          24        25           24
   ,,      ,,       ,,    b          23        25           25
   ,,      ,,       ,,    c          26        27           25
   ,,      ,,       ,,    d          27        28           28
   ,,      ,,       ,,    e          28        30           27

Averages                             25.6      27           25.8
Errors                               . . .     5.5%         1%
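The averages and errors of the table may be verified by a few lines; the helper names are illustrative assumptions. The error is reckoned as the excess of the estimated average over the fact, as a percentage of the fact.

```python
fact = [24, 23, 26, 27, 28]
biassed = [25, 25, 27, 28, 30]
unbiassed = [24, 25, 25, 28, 27]


def mean(values):
    return sum(values) / len(values)


def percentage_error(estimates, facts):
    """Excess of the estimated average over the true average, per cent. of the latter."""
    return 100 * (mean(estimates) - mean(facts)) / mean(facts)


biassed_error = percentage_error(biassed, fact)
unbiassed_error = percentage_error(unbiassed, fact)
print(mean(fact), mean(biassed), mean(unbiassed))
print(round(biassed_error, 1), round(unbiassed_error, 1))
```

Every biassed estimate is at or above the fact, so nothing cancels; the unbiassed estimates err by as much singly, but partly in opposite senses, and the error of their average comes out under 1 per cent.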



In measuring the distance of a bicycle ride on a mile-stoned 
road, it is found that the distances between successive milestones 
are not exact, but perhaps 100 to 200 yards out ; but it is nearly 
as likely that the errors will be in excess or defect, and the greater 
the distance gone the smaller will be the error, as defined. The 
errors are unbiassed. If, on the other hand, the bicyclist trusts 
to his cyclometer, he will have to deal with a biassed error, for the 
instrument will not fit the wheel exactly, but will always register 
say 1,800 yards when the machine has gone a mile. This is a 
case where the bias can be measured and allowed for, whereas 
the unbiassed errors must be left to eliminate themselves. It is 
frequently the case that biassed errors are due to a wrongly gradu- 
ated instrument ; unbiassed to separate faulty measurements. 

In the census returns, the fact that many women return 
themselves as younger than their birth certificate states, causes 
a biassed error in the average age of the population ; the fact 
that people frequently return their ages at the nearest round 
number causes unbiassed error, and on the whole does not affect
the average. It is not improbable that in the Wage Census of
1886-89, there was a general tendency to obtain returns from the
more liberally and better conducted establishments; this causes
a biassed error in the average obtained. With these illustrations
we can pass on to another principle of great importance.

Relative importance of biassed and unbiassed errors.

Unbiassed errors are of little importance
compared with biassed errors in a simple
estimate; but biassed errors diminish when the
ratio of two similar estimates is taken.
For in an average of several quantities, which have biassed errors
($\eta_1, \eta_2, \dots$) and unbiassed errors ($e_1, e_2, \dots$), it is easy to see from
Rule II. that the resulting error may be written $\Sigma\left(e\,.\,\frac{m}{S}\right) + \Sigma\left(\eta\,.\,\frac{m}{S}\right)$.

In the first term, the errors being unbiassed, many of them are
positive, many of them negative, and they tend to neutralise one
another; in fact, if $E$ is typical of the errors $e_1, e_2, \dots$, then a first
approximation to the error arising from them in the average is $\frac{E}{\sqrt{n}}$.

Thus in the average of one hundred measurements, whose individual
unbiassed errors are about $\tfrac{1}{10}$, the resulting error is
$\tfrac{1}{10} \div \sqrt{100} = \tfrac{1}{100}$. There is no counterbalancing tendency, on
the other hand, in the biassed errors; if each estimate was 10 per
cent. in excess, then the average is also 10 per cent. in excess.
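The contrast between the two kinds of error can be illustrated by a small simulation. This is only a sketch under stated assumptions: the unbiassed errors are drawn from a normal law with a typical size of one-tenth (the text does not specify a law), the quantities averaged are taken as equal, and the names and the seed are arbitrary.

```python
import random
import statistics


def error_of_average(n, typical, bias, rng):
    """Error in the average of n equal quantities, each with the same bias plus an unbiassed error."""
    errors = [bias + rng.gauss(0.0, typical) for _ in range(n)]
    return sum(errors) / n


rng = random.Random(0)   # fixed seed so that the run is repeatable
trials = 2000

# One hundred measurements, unbiassed errors of about 1/10, no bias.
unbiassed_results = [error_of_average(100, 0.10, 0.0, rng) for _ in range(trials)]
typical_unbiassed = statistics.pstdev(unbiassed_results)

# The same, but with every measurement also 10 per cent. in excess.
biassed_results = [error_of_average(100, 0.10, 0.10, rng) for _ in range(trials)]
typical_biassed = statistics.mean(biassed_results)

print(typical_unbiassed, typical_biassed)
```

On these assumptions the typical error of the unbiassed average should come out near 1/100, while the biassed average remains about 10 per cent. in excess however many measurements are taken.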

Great effect of biassed errors.

When aiming at accuracy our principle always is
to take care of the pounds, and let the pence take
care of themselves; and it is quite futile to diminish the unbiassed
errors, that is to increase the precision of our measurements,
while a large biassed error runs through them all.
If we do not know of the existence of biassed errors, which 
in reality pervade our estimates, there is no remedy ; if we 
do know of them, we are likely to obtain more accuracy by 
the most erroneous corrections for them than by neglecting 
them; for when we make unbiassed corrections for our biassed 
errors, we reduce them to unbiassed errors, and then the more 
terms we include in our average the smaller is our resulting error. 
If, for instance, we find that the average weekly wage of agricultural
labourers throughout the country is 11s.,* and by considering
the circumstances of the thousand returns which we may
suppose led to this average we have reason to suppose that an
error of 1s. would be typical of the unbiassed errors in them,
then an error of $\frac{1s.}{\sqrt{1000}}$, that is only $\tfrac{2}{5}d.$, may be expected to
result in the average.† We have here a totally illusive accuracy;

* See article cited p. 201, supra^ and Part II., Seo^ ^'j '^^• 
† More correctly the error in the average is as likely as not to be as great
as this, and very unlikely to be much greater.
the part of the labourer's income which we have not included,
payments at haytime and harvest, facilities for piece-work, cheap
rent for cottage and land and smaller perquisites, is not capable
of exact calculation. If we omit all these entirely we shall leave
an error in our average of 2s. or so; but if we make individual
estimates of these additions, in all the thousand cases, though
each estimate may be 2s. wrong, if there is no bias, the resulting
error on the average may be expected to be $\frac{2s.}{\sqrt{1000}}$, that is only $\tfrac{4}{5}d.$:
our whole error is now not far from 1d., instead of 2s. In
estimating the accuracy of published averages, these principles 
should be always borne in mind, and the possibility of biassed 
errors always considered. 

Accuracy of comparisons.

When we are dealing with the errors of a ratio the case is
quite different. The error of a ratio is approximately equal to
the difference between the errors in its terms; if
$\eta$, $\eta'$ and $e$, $e'$ are the biassed and unbiassed errors
in the terms, then by Rules I. and V. $(\eta' - \eta) + (e' - e)$ is the
error in the ratio. Now the unbiassed error $(e' - e)$ is likely to
be of nearly the same magnitude as either $e$ or $e'$;* if, as in the

above example, e and e^ are unlikely to be much greater than r, 

{e^ — e) would be unlikely to be much greater than - . But 

$(\eta' - \eta)$, the result of the biassed errors, will, if the bias in both
terms of the ratio was in the same sense (positive in both, or
negative in both), be less than the original errors. If we have
made the estimates of both terms on precisely similar methods,
if we have asked the same questions of the same classes of
persons, included and omitted the same details on both occasions,
we shall have made the same errors of bias in both estimates.
To return to our previous illustration, if we have made the
glaring mistake of omitting everything except average weekly
wages in the income of an agricultural labourer on both occasions,
the only resulting error in the ratio will be that due to
the change in these extra payments, which in short periods is
likely to be small. Or, if we had taken summer wages as the



* If $E$ is the probable error in $e$ or $e'$, then $E\sqrt{2}$ is the probable error in
their difference. See p. 305, infra.

average for the year in both cases, the error in the ratio will
depend only on the change in the relation of summer wages to 
that average. Hence the error in the ratio of two estimates 
at different dates of a slowly changing quantity is, if the 
estimates are made on similar methods, often much smaller 
than the error in either estimate singly ; for the unbiassed error 
is little greater, and the more important biassed error is much 
diminished. We need not now know of the existence of the 
biassed errors; they will disappear of themselves. If we are 
aware that there are biassed errors, and have any means of 
making fairly good estimates of them, it will be worth doing ; 
but we shall make a great mistake if we correct the bias in 
one year and leave it uncorrected in another.

Need for uniformity in structure.

For purposes of
comparison it is very seldom of much use and often of great
disutility to make the later estimate more accurate than the
earlier. The error resulting from unbiassed errors
can indeed be diminished a little,* but the error
resulting from the more important biassed errors
will only be increased. All Government officials and others who
compile annual returns are in a dilemma : to make their annual 
statements accurate in themselves, they should always be strain- 
ing after improvements, they should always be watching for 
changes in the quantities measured and adapting their methods 
and tabulations to these changes ; but to make their annual 
returns comparable with each other, they should be absolutely 
conservative, and cling to any mistakes they or their pre- 
decessors have made in the past with all the strength red tape 
can give them, being careful, however, not to add to the mistakes 
or make new omissions. The dilemma can in some cases be 
avoided ; for when an improved method is introduced, the 
tabulation can sometimes be given for a few years both on 
the old and on the new plans ; then when the difference 
introduced by the change is known, the earlier figures can be 
brought to the greater precision of the later. Thus the Board 
of Trade has recently included in the tabulation of exports 
ships which, leaving our shores with merchandise, are themselves
sold to a foreign owner; and we have the following
tabulation:*

                                                1899.            1898.
Exports of Home Products (exclusive of
  ships sold to foreigners)                 £255,465,000     £233,359,000
Re-exports of Home and Colonial
  Merchandise                                 65,020,000       60,655,000

Total                                       £320,485,000     £294,014,000
Value of New Ships exported                    9,195,000      Not stated.

New total                                   £329,680,000

* For if $E$ and $E_1$ be typical of the unbiassed errors at the two dates,
then $\sqrt{E_1^2 + E^2}$ is typical of the error in the ratio, which diminishes with
either $E$ or $E_1$. See p. 305, infra.

Results.



Ignorance of slight alterations in the collection and tabulation 
of material has been the cause of many statistical mistakes. 
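The argument that a bias common to both estimates largely disappears from their ratio, while a correction applied at one date only does harm, can be put in figures. The wages and omitted extras below are hypothetical, chosen merely for illustration, and the function name is an assumption.

```python
def proportionate_error(estimate, true_value):
    """Error e such that true_value = estimate * (1 + e)."""
    return (true_value - estimate) / estimate


# Hypothetical weekly wages in shillings at two dates; the extras are omitted in the estimates.
est_first, est_second = 11.0, 12.0
extras_first, extras_second = 2.0, 2.1
true_first = est_first + extras_first
true_second = est_second + extras_second

e_first = proportionate_error(est_first, true_first)      # about 18 per cent.
e_second = proportionate_error(est_second, true_second)   # about 17.5 per cent.

# Both estimates made on the same faulty plan: the bias nearly cancels in the ratio.
ratio_error = proportionate_error(est_second / est_first, true_second / true_first)

# Later estimate corrected, earlier left uncorrected: the bias reappears in full.
one_sided_error = proportionate_error(true_second / est_first, true_second / true_first)

print(e_first, e_second, ratio_error, one_sided_error)
```

With these figures each estimate is wrong by some 18 per cent., the ratio of the two uncorrected estimates by well under 1 per cent., and the ratio of a corrected to an uncorrected estimate by about 15 per cent.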

To sum up the chief results of this chapter: there are two
processes which tend to accuracy: averaging, which diminishes
unbiassed errors; and comparison, which diminishes
biassed error. The errors in weights are seldom
so important as the other errors which are present in estimates.
Errors in a result cannot, of course, be calculated, but can be 
expressed in terms of errors in the items, from which it comes ; 
we cannot attain certainty, but we can indicate processes which 
diminish errors, and with the help of mathematics measure the 
extent of diminution. Initial errors are diminished most, when 
we calculate the ratios of weighted averages of similar and 
similarly estimated quantities. Index-numbers, which we dis- 
cuss in the next chapter, are examples of this class. 

The accuracy resulting from the process of sampling requires 
more mathematical treatment, and is dealt with in Part II., 
Section V. 



* Quoted from the Economist, 17th February 1900, in the Statistical
Journal of March 1900.

CHAPTER IX.
INDEX-NUMBERS. 

The discussion of index-numbers supplies so good an illustra- 
tion of the principles laid down in the last chapter, and index- 
numbers are so important in themselves, that, though it is our 
intention to avoid special questions, it will be worth while to 
devote a short chapter to them. 

Function of index-numbers.

Index-numbers are used to measure the change in some
quantity which we cannot observe directly, which we know to
have a definite influence on many other quantities
which we can so observe, tending to increase all,
or diminish all, while this influence is concealed by the action
of many causes affecting the separate quantities in various ways.
Thus, to take three of the quantities to which index-numbers 
are applied, the change in the relation of the precious metals 
to the work to be done by them affects prices of all com- 
modities, but very many other causes are at work affecting the 
prices of separate groups of commodities ; there are general 
causes tending to raise the wage of a week's work of average 
skill, but this general increase is concealed by numberless minor 
causes affecting different grades of labour in different degrees ; 
the change in the consumption of goods by the working or other 
classes is a sufficiently definite quantity, but it can only be 
measured indirectly by observing the varying changes in the 
consumption of individual articles. 

The use of index-numbers is not, however, confined to these 
instances, but is nearly co-extensive with the field of statistics ; 
for we have limited the term statistics to the measurement of 
complex groups and their changes ; the object of statistics is to 
measure the action of the general laws which govern a hetero- 
geneous group, and the changes produced by general forces can 
be measured, as a rule, only by their effect in individual cases ; 
thus the method of index-numbers is at once applicable to the 
disentanglement of that which is common to the whole group
from those variations which are special to individual items. 

Method of formation.

The general method of forming an index-number, e.g., of the
fall of prices, is as follows: We select commodities, whose prices
we can estimate accurately, and tabulate their prices
for a series of years. Choosing the prices in one
year or the average for a sequence of years as a base, we express 
each series of prices as percentages, year by year, of their height 
in the chosen base-year, or their average in the chosen period. 
Then to find the index-number for any year in particular we 
take the average of the percentages in that year. 
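The steps just described may be sketched as follows. The commodities, years and prices are hypothetical, and the function name `index_numbers` is an assumption made for illustration; the simple (unweighted) average of the percentages is taken, as in the text.

```python
def index_numbers(prices, base_year):
    """Unweighted index: the average, year by year, of each price as a percentage of its base-year price.

    `prices` maps each commodity to a mapping of year to price.
    """
    years = sorted(next(iter(prices.values())))
    index = {}
    for year in years:
        percentages = [
            100 * series[year] / series[base_year] for series in prices.values()
        ]
        index[year] = sum(percentages) / len(percentages)
    return index


# Hypothetical prices of three commodities in two years.
prices = {
    "wheat": {1880: 40.0, 1890: 32.0},
    "iron": {1880: 50.0, 1890: 45.0},
    "cotton": {1880: 8.0, 1890: 6.0},
}

index = index_numbers(prices, base_year=1880)
print(index)
```

Here the three percentages for 1890 are 80, 90 and 75, and the index-number is their average, a little under 82, the base year standing at 100.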

Astronomical analogy.

The problem, of which index-numbers should give the
numerical solution, may be compared to that presented to
astronomers who estimate the motions of the sun
by observing those of the stars. As the sun and
earth move towards some distant point, say in the constellation 
Hercules, the stars have an apparent motion, due to the unper- 
ceived motion of the observer ; those in the region of space 
towards which he is travelling appear to be spreading out, as 
the distances separating them gradually subtend wider angles, 
while those in the region from which he is moving appear to 
close together, and those in directions perpendicular to the line 
of movement appear to move backward. Meanwhile all these 
stars have their proper motions, as rapid as that of the sun, 
but in as many different directions as there are stars. On 
the whole there is a trend in the directions determined by the 
sun's motion, but in individual cases this trend is entirely lost. 
So when a change in the currency has a general influence on 
prices, this influence is concealed by the movements due to 
causes affecting only some of the commodities. In both cases 
it is possible to find the general trend, if sufficient accurate 
observations are available. In both cases the problem is com- 
plicated by the possibility of links connecting the movements of 
groups of the stars or of the prices. 

Index-numbers and samples.

It has sometimes been supposed that we can estimate the
effects of general causes directly; that we can, for instance, obtain
an objective measurement of the change in the purchasing
power of gold, by evaluating it at two dates
in terms of all commodities purchased, weighted by the amount
spent on each; but it is better to neglect this method at once
both as impracticable and as not answering the purpose of index-numbers,
for the effects of minor causes affecting separate commodities
would not then be necessarily separated from the main
cause. 

Suppose that the changes in a group of quantities are deter- 
mined by one general force which acts on all in the same sense, 
that is, tends to increase all or decrease all, and by several other 
forces each of which acts on one or more of the quantities, and 
some of which tend to increase, others to decrease the quantities 
they affect ; then of the special forces, some will tend to increase, 
others to diminish the average, while the general force will 
have a cumulative effect entirely towards increasing, or entirely 
towards diminishing it. If the separate effects of the special 
forces are small compared with their number, they will tend to 
neutralise one another in their influence on the average; and 
the change in the average will show the influence of the general 
cause only.* In the language of the last chapter, the special 
forces produce unbiassed changes, which are negligible in their 
effect on an average, in comparison with the biassed changes 
produced by the general force. To obtain this elimination it is 
necessary to take random samples, so that the laws of probability 
may have free action ; and the two questions to be discussed are 
the choice of samples, and the choice of weights to be applied to 
them.

Unimportance of weights.

As we have already seen, the effect produced by varying the
system of weights applied to so few as 30 or 40 numbers is
very slight, and the error resulting from errors in
weighting is in many cases much smaller than
the error resulting from faulty measurements of the quantities
weighted. We shall presently show † that the precision of an
average increases with the number of like quantities averaged. 
From these principles it is clear that it is more important to 
increase the number of our samples than to attempt accurate 
calculations of the proper weights to give them. 

The choice of samples is in practice very much limited, for 
in calculations extending over long periods we are dependent on 
the accidental preservation of records ; and when we have taken 



* This abbreviated statement should be criticised in the light of Part II.,
Sect. V., infra. See also Report of Committee on Variations in the Monetary
Standard, British Association, 1888-90.

† See p. 305, infra.

into our reckoning all the measurements which can be accurately
made, the number of samples barely comes up to the minimum 
necessary for the normal action of the laws of probability. 

The Board of Trade index.

There are many index-numbers of wholesale prices extant,
some of which we may pass in review. The Board of Trade
publish the recorded quantity and value of goods
imported and exported, and the average prices of
these goods can be calculated. Those commodities are selected
which occur in the returns for the whole period chosen. A
particular year is chosen as base; then the goods are valued
in all other years separately at their prices in the base year;
the total of these values in any year is the sum which the
goods would have been worth if their prices had remained
unchanged; the ratio of this value to that actually recorded
is the ratio of their average price in the base year to their
average price in the other year selected (if the term average
is used broadly), and if the first term of this ratio is equated
to 100, the second term is the index-number required for the
year selected, expressed as a percentage of the number for the 
base year. It is at once evident that we are here dealing with 
weighted averages. 
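That the valuation at base-year prices amounts to a weighted average may be shown in a short sketch. The quantities and prices are hypothetical and the function names are illustrative assumptions; the second function weights each price-ratio by the value of the selected year's goods at base-year prices.

```python
def index_by_valuation(quantities, base_prices, year_prices):
    """Recorded value of the year's goods, as a percentage of their value at base-year prices."""
    recorded_value = sum(b * p for b, p in zip(quantities, year_prices))
    value_at_base_prices = sum(b * p for b, p in zip(quantities, base_prices))
    return 100 * recorded_value / value_at_base_prices


def index_by_weighted_average(quantities, base_prices, year_prices):
    """Weighted average of the price-ratios, the weights being the values at base-year prices."""
    weights = [b * p for b, p in zip(quantities, base_prices)]
    ratios = [new / old for old, new in zip(base_prices, year_prices)]
    return 100 * sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


# Hypothetical quantities in the selected year, with prices per unit in the two years.
quantities = [10.0, 4.0, 25.0]
base_prices = [5.0, 20.0, 2.0]
year_prices = [4.0, 18.0, 1.5]

by_valuation = index_by_valuation(quantities, base_prices, year_prices)
by_weights = index_by_weighted_average(quantities, base_prices, year_prices)
print(by_valuation, by_weights)
```

The two results agree, both being 149.5 on 180, or about 83.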

Systems of weights.

Let $p_1, p_2, p_3, \dots$ be the prices in the base year of units
of the goods selected, and $r_1p_1, r_2p_2, r_3p_3, \dots$ the prices in
the year for which we require an index-number:
then $r_1, r_2, r_3, \dots$ measure the changes of prices
for the separate commodities, and these $r$'s are the samples
from which we are to deduce the general change of price.
The weights used in the process described may be found
thus: let $b_1, b_2, b_3, \dots$ be the numbers of units of goods in
the selected year; then the total value in the selected year
at the prices of that year is $(b_1r_1p_1 + b_2r_2p_2 + \dots)$, and at the
prices of the base year is $(b_1p_1 + b_2p_2 + \dots)$; the ratio is
$\Sigma brp : \Sigma bp$, and the index-number for the selected year is

$$100 \times \frac{\Sigma(r\,.\,bp)}{\Sigma bp}.$$

Here the weights applied to the r^s are the values which the 
corresponding goods in the selected year would have borne at 
the prices of the base year. It is clear that the selection of the 
standard year affects the weights, for any particular commodity 
can be given special weight by choosing as base a year in which 
its price is high, and much trouble has been spent in searching 
for a "normal" year; but though the weights of separate commodities
are affected, it does not follow that the average will
be altered, and we should expect from the principle laid down 
above that the change would be very slight. In fact we have 
the following figures : — 



INDEX NUMBERS OF 1886 AND 1883 COMPARED.*

                       Imports.                                Exports.
Weights.   Values at  Values at  Values at  Values at   Values at  Values at  Values at  Values at
             1873       1883       1861       1881        . . .      1883       1861       1881
            Prices.    Prices.    Prices.    Prices.     Prices.    Prices.    Prices.    Prices.

1883         100        100        100        100         100        100        100        100
1886         81.7       82.1       82.9       82.3        88½        88         87         89


It is possible to produce figures which show a variation
caused by a change of base year, but it is done by choosing
samples which lend themselves to the special argument.

Since so great an alteration in choice of weights makes so
little difference, it is worth while to see if we need even keep
the weight due to the quantities imported (the $b$'s in the above
formulae). The following table may be quoted† to show that
even these weights have practically no influence:

Index-Numbers for 1895, when that of 1881 is 100, obtained by
Various Systems of Weighting.

          Weighted    Weighted    Ratios of Prices (r1, r2 ...)    Reciprocal   Economist's
          by Values   by Declared                                  of A.M. of   Figures.
          of 1895     Values in   Arithmetic  Median.  Geometric   1/r1,
          Quantities  1881.       Mean.                Mean.       1/r2 ...
          at 1881
          Prices.

Imports     67½         69          73½        72½       72½         69
                                                                                   71
Exports     83          87          82         81        78½         75

* From the Economic Journal and the Statistical Journal, both June
1897.

† From the Economic Journal (with a correction in the statement of
weights).
In the first column of figures the goods of 1895 ($b_1, b_2, b_3
\ldots$ units at prices $p_1', p_2', p_3' \ldots$) are valued at the
prices of similar goods in 1881 ($p_1, p_2, p_3 \ldots$). The ratio of
their new to their old value, $\Sigma bp' / \Sigma bp = \Sigma (r
\cdot bp) / \Sigma bp$,* is the ratio of the index-number to 100. In
the next column the index-number is obtained by valuing the
quantities of 1881 at the prices of 1895; then the ratio of the new
value to the old, $\Sigma ap' / \Sigma ap$ (where $a_1, a_2 \ldots$
are quantities in 1881) $= \Sigma (r \cdot ap) / \Sigma ap$, is the
ratio of the index-number of 1895 to 100 for 1881. In the next three
columns the arithmetic mean, the median, and the geometric mean of
the $r$'s are given. In the last column but one the arithmetic mean
of $1/r_1, 1/r_2 \ldots$, that is of the ratios of the prices of 1881
to those of 1895, is calculated, and the ratio of this mean to 100
equals the ratio of 100 to a new index-number, which corresponds to
the former arithmetic mean with the years 1881 and 1895
interchanged. The figure in the last column is calculated from
material given in the Economist;† every year the imports and exports
are valued at their prices in the previous year, and thus an annual
ratio is given similar to that in the first column of figures in the
table just given; the number 100, taken for 1881, is multiplied by
this annual ratio year by year till 1895, and the number 71 is the
result.
[Algebraically this index-number is — 

A more complete analysis of these figures, and an investiga- 
tion as to the causes of the divergence between the export 
indices 87 and 75, would show which of the methods should be 
adopted. Here we will be content with noticing that the 
unweighted average, 82, is very near the first weighted 
average, 83. 
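
The systems of weighting compared above can be sketched in a few lines. The prices and quantities below are hypothetical figures chosen only for illustration, not the trade returns; the function name `index_numbers` is likewise our own. The sketch computes, for one set of goods, the two weighted averages, the three unweighted averages of the price ratios, and the reciprocal of the arithmetic mean of the reciprocals.

```python
from statistics import geometric_mean, mean, median


def index_numbers(p_base, p_new, q_base, q_new):
    """Index-numbers for the new year, the base year being taken as 100."""
    # Price ratios r1, r2 ... of the new year to the base year.
    r = [new / old for old, new in zip(p_base, p_new)]

    def weighted(quantities):
        # Value of a fixed list of quantities at new prices,
        # as a percentage of its value at base-year prices.
        new_value = sum(q * p for q, p in zip(quantities, p_new))
        old_value = sum(q * p for q, p in zip(quantities, p_base))
        return 100 * new_value / old_value

    return {
        "new-year quantities": weighted(q_new),
        "base-year quantities": weighted(q_base),
        "arithmetic mean": 100 * mean(r),
        "median": 100 * median(r),
        "geometric mean": 100 * geometric_mean(r),
        "reciprocal of A.M. of 1/r": 100 / mean([1 / x for x in r]),
    }


if __name__ == "__main__":
    # Hypothetical prices and quantities for four commodities.
    result = index_numbers(
        p_base=[10.0, 4.0, 25.0, 2.0],
        p_new=[8.0, 3.0, 21.0, 1.5],
        q_base=[30, 100, 8, 250],
        q_new=[35, 120, 8, 300],
    )
    for name, value in result.items():
        print(f"{name:28s} {value:6.1f}")
```

When every price changes in the same ratio all six numbers coincide; the differences between them arise only from the dispersion of the separate ratios.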

Further methods of dealing with such weights are given on 
p. 226, under Retail Index-Numbers. 

* Where $r_1 = p_1'/p_1$, $r_2 = p_2'/p_2$, &c.

† See, for instance, the quotation from the Economist in the Statistical
Journal, 1900, p. 139.
[Objective measure.] The advantage of index-numbers on the Board of
Trade basis is that they measure approximately an objective quantity,
and a result is obtained which can be stated in terms which appeal to
the ordinary man who is not a statistician: such as, "The imports of
1895 would have cost half as much again if their prices had been
those of 1881;" but, as pointed out above, it does not follow that
this index is the best measure of the less-definable quantity, "Fall
in the price of imports," where we imagine a general cause affecting
this class of commodities whose action is modified by other
partial causes.

[Geometric mean.] A special advantage of the geometric mean* is that
the results it gives are independent of the year chosen as base; for
if $p_1, p_2 \ldots p_n$ and $p_1', p_2' \ldots p_n'$ are the prices
in two years, $\sqrt[n]{p_1p_2 \ldots p_n} : \sqrt[n]{p_1'p_2' \ldots
p_n'} = 100 : I_1$, the required index-number; hence, if $p_1'', p_2''
\ldots p_n''$ are the prices in a third year, and $I_2$ its
index-number,

$$I_2 = 100 \times \frac{\sqrt[n]{p_1''p_2'' \ldots p_n''}}{\sqrt[n]{p_1p_2 \ldots p_n}}
      = I_1 \times \frac{\sqrt[n]{p_1''p_2'' \ldots p_n''}}{\sqrt[n]{p_1'p_2' \ldots p_n'}},$$

so that the ratio of $I_2$ to $I_1$ is that which would be obtained
for the third year if only the second and third were considered.
Considering the extra labour involved in calculating this mean, and
the small advantage obtained by any alteration in the weighting, its
use is not to be generally recommended.†
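
The independence of base claimed for the geometric mean can be checked numerically. The following is a minimal sketch with hypothetical prices for three years; the function names `gm_index` and `am_index` are ours. It shows that the geometric index of the third year on the first equals the index of the second on the first multiplied by that of the third on the second, whereas the unweighted arithmetic mean of ratios has no such property.

```python
from math import prod


def gm_index(base, other):
    """Geometric-mean index of `other` prices, `base` prices being 100."""
    n = len(base)
    return 100 * (prod(other) / prod(base)) ** (1 / n)


def am_index(base, other):
    """Unweighted arithmetic mean of the price ratios, as a percentage."""
    return 100 * sum(o / b for b, o in zip(base, other)) / len(base)


# Hypothetical prices of three commodities in three years.
year_1 = [10.0, 4.0, 25.0]
year_2 = [8.0, 5.0, 20.0]
year_3 = [9.0, 4.5, 15.0]

direct = gm_index(year_1, year_3)
chained = gm_index(year_1, year_2) * gm_index(year_2, year_3) / 100

if __name__ == "__main__":
    print(f"geometric:  direct {direct:.3f}, chained {chained:.3f}")
    print(
        "arithmetic: direct "
        f"{am_index(year_1, year_3):.3f}, chained "
        f"{am_index(year_1, year_2) * am_index(year_2, year_3) / 100:.3f}"
    )
```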

[Other index-numbers.] Mr Sauerbeck and the Economist both avoid in
part the difficulty of weighting the separate ratios by their
relative importance in consumption, by selecting from those
commodities whose prices are most accurately determined more
instances of such widely consumed articles as wheat than of less
important commodities such as linseed. Mr Sauerbeck has, in his
annual articles in the Journal of the Royal Statistical Society,
verified the correspondence of the unweighted average of his 56
ratios with the average of the same weighted on various principles.

* Pointed out by Professor Edgeworth. 

† On this point, and on others in this chapter, see article Index-
Numbers, in Palgrave's Dictionary of Political Economy.
[Importance of right choice of samples.] While the choice of the
special weights to be employed is, when the number of ratios taken is
at all considerable, quite unimportant, the choice of the quantities
dealt with has great effect on the result. Thus import figures,
relating to raw materials and the produce of other countries, do not
lead to the same index-numbers as export figures dealing with the
price of our own produce, though the tables just given show that they
are little affected by weights; and neither of these agrees closely
with Mr Sauerbeck's or the Economist's numbers, and these again are
not in complete agreement. The samples on which these four sets of
numbers are based are from different groups of commodities, and the
numbers show that the same forces do not affect these groups
in the same degree. When we have so multiplied our samples
that we can subdivide them without affecting the index-numbers
deduced, we may expect our results to represent the required
measurement.*

[Great advantage of the median.] If we compare the Economist
index-numbers with Sauerbeck's during the period 1860-70, we see that
the former show a very much greater increase during the cotton
famine than the latter. An index-number which can be greatly
disturbed by fluctuations, however violent, in only one group of
commodities, is clearly wanting in some of the chief qualities of a
general measure of price levels. A very simple means of avoiding
this difficulty, and indeed all the intricacies of weighting, is to
take the median of all the price ratios of a particular year as the
index-number of that year. It is perhaps impossible to show
theoretically that any other average satisfies the required
conditions better than the median, and there can be no doubt that it
is practically the easiest to calculate.
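
The stability of the median under a violent movement in a single commodity may be seen from a small sketch. The five price ratios below are hypothetical, the last standing for a commodity such as cotton in a year of famine; they are not the Economist's or Mr Sauerbeck's figures.

```python
from statistics import mean, median

# Hypothetical price ratios (per cent. of the base year) for five commodities.
ordinary_year = [96, 98, 100, 102, 104]
# The same commodities when one of them (the last) trebles in price.
disturbed_year = [96, 98, 100, 102, 300]

mean_ordinary, median_ordinary = mean(ordinary_year), median(ordinary_year)
mean_disturbed, median_disturbed = mean(disturbed_year), median(disturbed_year)

if __name__ == "__main__":
    print("ordinary year : mean", mean_ordinary, " median", median_ordinary)
    print("disturbed year: mean", mean_disturbed, " median", median_disturbed)
```

The arithmetic mean is carried from 100 to 139.2 by the one commodity, while the median remains at 100.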

* Mr Sauerbeck's numbers are to be found in annual articles by him in
the Statistical Journal; and a diagram showing them from 1820 is
published by Effingham Wilson (1s.).

[Proposed standard.] If, on the other hand, paucity of data makes the
inclusion of weights necessary, and the popular desire for concrete
measurements makes a fine show of weighting expedient, we perhaps
cannot do better than to adopt the standard proposed by the Committee
of the British Association, already mentioned, for the construction
of an index-number, which might be the basis of business transactions
involving future payments. This standard is as follows:

Basis of Index-Number recommended by the Committee appointed by
the Economic Section of the British Association, 1888.



Articles. 



Wheat - 

Barley 

Oats 

Potatoes, rice, &c. 

Meat 

Fish 

Cheese, butter, milk 

Sugar 

Tea 

Beer 

Spirits 

Wine 

Tobacco - 

Cotton - 

Wool - 

Silk 

Leather - 

Coal 

Iron 

Copper - 

Lead, zinc, tin 

Timber - 

Petroleum 

Indigo - 

Flax and linseed 

Palm oil 

Caoutchouc 



Estimated 
Expenditure 

per Annum 
on each. 
000,000's 
omitted. 



30 
50 

50 
100 

20 

60 

30 
20 

ICO 

40 

10 
10 
20 

30 

20 

10 

too 

50 
25 
25 
30 

5 

5 
10 

5 
5 



Hence 
Weights 
assigned. 



.20 



20 



7i 



'20 



10 



10 



Prices to be taken from 



Gazette average, English wheat. 
n » barley. 

n t oats. 

Av. import price, » potatoes. 
Market quotations, live meat, 

Smithfield. 
Board of Trade Returns-; aver- 
age per cwt. landed. 
Cheese and butter, average im- 
port price. 
Av. import price, refined sugar. 

tea. 



m 
m 

» 

m 
m 



export 
import 

m 



m 
m 

m 



export 















beer. 

spirits. 

wine. 

tobacco. 

cotton. 

wool. 

raw silk. 

hides. 

coal. 



Market price, Scotch pig-iron. 
Av. import price, copper ore. 

// // lead ore. 

Average import price. 



m 







m 













[Retail price index.] Since we can only obtain rough correspondence
in dealing with wholesale prices, we cannot expect to be able to
measure retail prices with any great precision. For we saw in the
preceding chapter that the error in an average bears a definite
relation to the errors in the items which compose it; if the errors
in the items are on the whole doubled, it is likely that the errors
in the average and in the ratio of two averages will also be doubled,
and we shall need four times* as many samples to restore the
precision. Unfortunately the material for computing a retail
index-number is even more incomplete than that for wholesale prices,
and owing to the smaller number of articles that can be included, and
the preponderance of such items as bread and rent, the question of
weighting becomes of more importance.

* See p. 305, infra.
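
The statement that doubled errors call for four times as many samples may be illustrated as follows. The sketch assumes, as our own simplification and not as a quotation of the rule referred to above, that the errors in the items are independent, so that the error of an average of n items varies as the error of a single item divided by the square root of n; the function name and the figures are hypothetical.

```python
from math import isclose, sqrt


def error_of_average(item_error, n_items):
    """Probable error of an average of n independent items (assumed rule)."""
    return item_error / sqrt(n_items)


# Hypothetical case: 100 wholesale quotations, each liable to an error of 1 unit.
wholesale = error_of_average(1.0, 100)
# Retail quotations with doubled errors: the same number of samples ...
retail_same_n = error_of_average(2.0, 100)
# ... and four times the number of samples.
retail_four_n = error_of_average(2.0, 400)

if __name__ == "__main__":
    print(wholesale, retail_same_n, retail_four_n)
```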

[Special difficulties.] When we wish to construct an index-number to
show the purchasing power of money to special classes, we must take
into account some considerations which can be ignored when dealing
with wholesale price numbers. Different classes of persons at the
same time, and the same classes at different times, spend their
income in varying proportions on different objects. If we could
collect enough sufficiently accurate samples, this fact would not
matter so much; but it would still be of some importance owing to the
tendency to make increased purchases of cheapening commodities. As it
is, it would be necessary to construct separate index-numbers for
each class and each district. The difficulty of insufficient and
inaccurate data cannot at present be overcome; but as it is possible
that we may in the future get definite records of retail prices
sufficiently numerous to make up for their want of precision, we may
glance at the other details of the problem. To form an index-number
for a particular class of people, we need records of the method of
expenditure of their income at all the dates in question, of
sufficient numbers to obtain the slight precision which weighting
needs.

[Methods of weighting.] Then if we had fairly good records of retail
prices several methods of weighting are open to us,* all of which
are likely to give nearly the same result. The necessity of weighting
and the methods are best shown by a numerical illustration.

Suppose the following records of expenditure:

        First Year.                        s.  d.
   6 quarterns bread at 6d.                 3   0
   4 lbs. meat at 7d.                       2   4
   ½ lb. tea at 3s.                         1   6
                                            6  10

        Second Year.                       s.  d.
   7 quarterns at 5d.                       2  11
   5 lbs. at 8d.                            3   4
   1½ lbs. at 1s. 4d.                       2   0
                                            8   3

The second year's budget at the first year's prices would cost
10s. 11d.; index-number of retail prices on this basis,
100 × (8s. 3d. / 10s. 11d.) = 75.6 (a).

* See article on Wages, Nominal and Real, in Palgrave's Dictionary of
Political Economy, pp. 640-641.




The first year's budget at the second year's prices would cost
5s. 10d.; index-number on this basis, 100 × (5s. 10d. / 6s. 10d.)
= 85.4 (b).

The ratio of the costs of 6½ quarterns, 4½ lbs. meat, 1 lb. tea,
the averages of the quantities in the two years, at the two sets
of prices, is 8s. 10½d. to 7s. 0½d. = 100 : 79.3 (c).

If we disregard the records of changing expenditure, we 
find that the unweighted average of the three ratios of prices is 
100 : 80.8 {d). 

If we suppose the second sum (8s. 3d.) to be spent on bread,
meat, and tea in the same proportional parts as the first sum
(6s. 10d.), we have:

(3s. 0d. / 6s. 10d.) of 8s. 3d., i.e., 43½d., would have bought
43½/5 quarterns at the second price, which would have cost
(43½/5) × 6d. at first price.

Working out the other items in a similar way, we find that
the second sum distributed in the same proportions as the first
would have bought goods which would have cost 10s. 10½d. at
the first prices; and the resulting index-number is

(6s. 10d. / 10s. 10½d.) of 100 = 62.8 (e).

This reduction is due to the large expenditure on bread on 
this hypothesis, which can easily be shown to be an unreasonable 
one if we suppose the price of bread to be reduced to nothing, 
while the other prices rise ; then on this hypothesis the fall is 
infinite. 

Of the above numbers, (c), (d), and (e) do not seem to rest on
sound hypotheses; (a) clearly overstates and (b) clearly
understates the fall; and therefore some number between (a) and (b)
is the number required. If (a) and (b) lie close together there
is no further difficulty; if they differ by much they may be
regarded as inferior and superior limits of the index-number,
which may be estimated as their arithmetic mean (80.5) as a first
approximation.
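
The limits (a) and (b) and their mean can be verified by working the budgets in pence. The following is a sketch only; the dictionary names are ours, and the quantities of tea are taken as ½ lb. and 1½ lbs., which is what the stated costs of 1s. 6d. and 2s. imply.

```python
# Prices in pence per unit (quartern of bread, lb. of meat, lb. of tea).
PRICES_FIRST = {"bread": 6, "meat": 7, "tea": 36}
PRICES_SECOND = {"bread": 5, "meat": 8, "tea": 16}

# Quantities bought in each year.
BUDGET_FIRST = {"bread": 6, "meat": 4, "tea": 0.5}
BUDGET_SECOND = {"bread": 7, "meat": 5, "tea": 1.5}


def cost(budget, prices):
    """Cost in pence of a budget at a given set of prices."""
    return sum(quantity * prices[item] for item, quantity in budget.items())


# (a): the second year's budget valued at both sets of prices.
index_a = 100 * cost(BUDGET_SECOND, PRICES_SECOND) / cost(BUDGET_SECOND, PRICES_FIRST)
# (b): the first year's budget valued at both sets of prices.
index_b = 100 * cost(BUDGET_FIRST, PRICES_SECOND) / cost(BUDGET_FIRST, PRICES_FIRST)
# First approximation: the arithmetic mean of the two limits.
first_approximation = (index_a + index_b) / 2

if __name__ == "__main__":
    print(f"(a) {index_a:.1f}  (b) {index_b:.1f}  mean {first_approximation:.1f}")
```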

[Further difficulties.] While it is useful to have a definite means
of calculating these numbers to bring extravagant statements to a
numerical test, there are two further considerations which hinder the
complete solution of the problem. In all budgets rent is an important
item, and there seems no prospect of obtaining any good estimate of
the relation between increasing rent and improving accommodation,
allowing for the benefits of public expenditure paid by rates
included in rent. Again, if we consider, not how money is spent, but
how it might be spent, we should have to introduce a more general
factor; for the margin which remains when necessities are satisfied
has a rapidly growing purchasing power, as the products of machinery
increase in variety and diminish in price; perhaps the calculated
fall in wholesale prices forms a fair measure of this growth.

[Index-numbers of consumption.] Leaving this somewhat unfruitful
topic, let us return for a moment to the measurement of a quantity
more typical of index-numbers.* If we have to measure the action of a
cause, which affects quantities which have no common measure, we are
still able to apply index-numbers. A general increase has taken place
in the consumption of imported goods, and if we can measure this
increase independently of any change in price, we can use it as an
argument to support the alleged increase in real wages. The only
common measure of bread, currants, cheese, meat, &c., of practical
value is their price, their weight being useless for the purpose. If
the quantities consumed year by year of a number of such commodities
are written down, expressed as percentages of the consumption in any
years (not necessarily the same), we have series of numbers which
only need weighting to form the index-number required. We can in this
case verify that any logical choice of weights, based on their value
or their assumed importance, or even a random system of weights,
gives much the same index-number as the simple arithmetic averages;
in fact, we have a sufficiently good group of samples to render us
nearly independent of weights. When this is the case we can say with
safety that the number required lies in the neighbourhood of the
group given by the various systems of weights, and choose what
appears the most logical system for the estimate we adopt. In
the paper referred to, five different systems applied to only
fourteen commodities give results for the increase of consumption
all between 13.8 and 20.1 per cent. in the period 1873-96.

[Wage index-numbers.] The application of index-numbers to wage
statistics does not involve any fresh principles. It is not
permissible to ignore weights in this case; for an unweighted average
would not allow for the general tendency to increase numbers where
wages are rising. There is great liability to "biassed" errors in
separate averages; for wages for overtime, specially high
piece-wages, wages of large uncombined classes of low-skilled or
badly paid workpeople, may often be omitted in wage records. These
biassed errors, however, tend to disappear in comparison; and it may
prove possible to construct a wage index-number of very fair
precision.

* The following illustration is based on Mr G. H. Wood's paper on
Some Statistics of Working Class Problems, cited above.



CHAPTER X. 
INTERPOLATION 




I. General. 

[Necessity of interpolation.] It is very often the case in practical
statistics that we are not able to make serial estimates as frequent,
or descriptions of groups as detailed, as is necessary for their use
in further investigations. Thus the population is only counted once
in ten years; but we need to bring monthly and annual accounts
(births, deaths, trade returns, &c.) into close relation to the
existing number of people, and estimates for the budget and the yield
of taxes must be based on the assumed number of taxpayers for the
current year; it is therefore necessary to interpolate estimates for
the number of the people in intercensal years. Again, interpolation
is needed for the statement of the distribution of the population
according to age, a tabulation which is necessary for actuarial work
and for sociological purposes. The ages returned on the householder's
schedule are nominally correct to the year, but in practice they are
known to be inaccurate, tending to group themselves in the
neighbourhood of round numbers; but the returns for such age periods
as 35-45 years are more correct, since the persons who return
themselves as 40 years old are probably within 5 years of that age.
The original returns are so erroneous that they are not published at
all, but the numbers are only given in the ten-yearly periods; from
the numbers so given, it is necessary to estimate the numbers for the
individual years. Again, the compilers of the wage census of 1886-91
enumerate the numbers earning wages "of 15s. and under 20s.," "of
20s. and under 25s.," and so on, but not the numbers in shilling
limits. In problems relating to wages we often need more detail; and
when we are comparing these wages with a similar group in France, we
must devise a scheme by which grades of 2 francs can be compared with
grades of 5s., by a suitable system of interpolation. Such a
necessity is very common when we wish to compare groups which are
similar but tabulated on diverse systems. Thus, two countries conduct
their census at different dates. In one country the age groups
are of fifteen years, in another of ten; in one, "young persons"
are those under 21; in another, those under 18. Occasional
estimates seldom correspond in date; wage statistics are found
for 1840, 1850, and 1892 in France, and for 1866, 1885, 1886,
and 1891 in England. Similar differences are found when we
are comparing county with county; and a discussion of the
method of determining averages in such a case will illustrate
some of the elementary problems of interpolation.

[Elementary example.] Suppose that the figures printed in Roman type
in the following table are accurate returns of the weekly wages in
three districts, and that we wish to find the average change in the
three together.

Years.       1860.  1862.  1864.  1866.  1870.  1871.  1875.  1878.  1880.  1881.
             s. d.  s. d.  s. d.  s. d.  s. d.  s. d.  s. d.  s. d.  s. d.  s. d.
District A   12 6   15 0   15 0   15 0   15 0   14 6   18 0   18 0   17 6   17 0
   "     B   18 0   19 0   19 0   20 0   20 0   19 6   21 0   21 0   20 6   20 0
   "     C   10 0   11 0   11 0   12 0   12 0   12 0   15 0   15 0   15 0   14 6
Average      13 6   15 0   15 0   15 8   15 8   15 4   18 0   18 0   17 8   17 2


It is clear that there is something to be learnt about the
general course of wages from the data, but the lessons are not
obvious. The following figures, printed in the table in italics,
are those which naturally suggest themselves. There is no sign
in A of any change between 1862 and 1866, so we write 15s. for
1864. Judging from B, the figure for 1870 is not likely to have
been lower than that for 1864, so we write 15s. for A in 1870.
A is now complete; we notice that in A the first rise was
complete by 1862, and assuming the same in B, we obtain 19s. for
1862. In C there is a rise between 1864 and 1866, while in A
there is no change from 1866 to 1870; B will correspond if we
write 20s. in 1866. If we write for B, 19s. 6d. in 1871, 21s. in
1875, and 20s. 6d. in 1880, we shall have close correspondence
with A from 1866 to 1881. Similar reasons lead to the numbers
interpolated for C. The unweighted average can then be calculated
year by year, which could not be done directly from the
data. This average reflects all the changes in the original figures
and gives no special predominance to any. It may be regarded
as the most probable series that can be based on the given
information.

[Assumptions made.] We will now notice the assumptions tacitly made
in proceeding by this method. First, it has been assumed that there
are no sudden jumps, that such a figure as 20s. for A 1864 is
inadmissible; this is only justifiable if we are acquainted with the
general causes which influence the rate of wages, and know that there
was no violent disturbance in the intermediate dates. We could not
make this assumption as to wages in the cotton trade in the time of
the American Civil War, nor can we make it over a long series
of years. Secondly, it has been assumed that in the absence
of evidence to the contrary the rise or fall has been uniform.
Thus, in B 1878-81, the wage in 1880 is assumed to be intermediate
between 1878 and 1881; if there had been no indication from A that
it was half-way between in point of wages, it might have been said
that in point of time it was two-thirds of the way, and 20s. 8d.
should be interpolated for 1879 and 20s. 4d. for 1880, if it was
worth while to depart from round numbers. Thirdly, it has been
assumed that the course of wages in the three districts was similar.
Thus in A there is a rise from 1860-62, but there is no further
improvement at any rate before 1866; it is consequently assumed that
the rise registered in B and C before 1864 actually took place before
1862. Again, when considering the period 1870-75, we notice
that in A there is a fall till 1871, and a sharp rise to 1875, and
no change to 1878; in B, therefore, it is assumed that the wage
of 1875 is equal to that of 1878, and the fall in 1878 may be
allowed because it increases the sharpness of the rise in 1871-75.
In C it is doubtful whether the 12s. in 1871 should not rather
be 11s. 6d. The reasons against are that a gain on a low wage
is often not so easily lost as a gain on a high one; 6d. is a larger
drop proportionately on 12s. than on 15s.; that the rise of 3s. 6d.
which would then be shown 1871-75 is a larger proportionate
rise than in either A or B; and that the existence of the fall in
1870-71 depends only on the evidence of a fall between 1866-71.
When the figures are few in number, it is necessary to examine
them in this way to pick out the most probable; and it is often
fairly easy to fill in the figures which satisfy all the existing
evidence fairly closely.
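
The second assumption, of uniform rise or fall in point of time, is simple linear interpolation. As a sketch (the function names are ours), the figures of 20s. 8d. and 20s. 4d. named above for district B follow from the given wages of 21s. in 1878 and 20s. in 1881, working in pence:

```python
def interpolate_uniform(year_0, value_0, year_1, value_1, year):
    """Value at `year`, assuming a uniform change between two given dates."""
    return value_0 + (value_1 - value_0) * (year - year_0) / (year_1 - year_0)


def shillings_and_pence(pence):
    """Express a sum in pence as shillings and pence, to the nearest penny."""
    shillings, odd_pence = divmod(round(pence), 12)
    return f"{shillings}s. {odd_pence}d."


# District B: 21s. (252d.) in 1878 and 20s. (240d.) in 1881 are given.
b_1879 = interpolate_uniform(1878, 252, 1881, 240, 1879)
b_1880 = interpolate_uniform(1878, 252, 1881, 240, 1880)

if __name__ == "__main__":
    print("1879:", shillings_and_pence(b_1879))
    print("1880:", shillings_and_pence(b_1880))
```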

The question at once arises, What certainty have we that 
these quantities, by hypothesis unknown, are in reality anywhere 
near the figures which on the face are most probable ? 

In some cases of interpolation, dealt with presently, the
answer can be given as a statement of mathematical probability,
such as: it is 2 to 1 against a divergence of 6d. from the assigned
figure, 30 to 1 against one of 1s., 1,000 to 1 against one of
2s. 6d., and so on; but in the figures most often cropping up in
investigations it is not possible to assign such a precise
probability. There is one rough but useful way of testing the
accuracy of such interpolation as in the case before us which can be
explained by an example. Test how far we can throw out our calculated
average for 1870, without violently infringing the common-sense
of the question. Make A and C as large as possible at this
date; we may perhaps suppose a rise of 1s. above 1866, seeing
that there is one in B between 1864 and 1870. We can hardly
suppose either that 1870 is as high as 1875-78, or that there is
a great drop of as much as 2s. in the single year, if we are
acquainted with the causes that determine the wages at those
dates. Let the highest wage we can assign to A and C be
16s. 6d. and 13s. 6d. respectively. Our average is then 16s. 8d.
instead of 15s. 8d. Similarly, we might perhaps think that
14s. and 11s. were the lowest possible in A and C in 1870;
then the average would be 15s. Assuming that we know enough
about the general trend of events at these dates to assign limits
in this way, we can say it appears improbable that the average
wage in 1870 was less than 15s. or more than 16s. 8d., and that
the evidence points to 15s. 8d.
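
The test just described amounts to recomputing the average with the interpolated items pushed to their highest and lowest admissible values. A short sketch in pence, using the 1870 figures named in the text (B given as 20s.; A and C interpolated), with names of our own choosing:

```python
def average(values):
    """Unweighted average of the district wages."""
    return sum(values) / len(values)


def shillings_and_pence(pence):
    """Express a sum in pence as shillings and pence, to the nearest penny."""
    shillings, odd_pence = divmod(round(pence), 12)
    return f"{shillings}s. {odd_pence}d."


B_GIVEN = 240  # 20s., a given figure for 1870

# (A, C) in pence: lowest admissible, most probable, highest admissible.
lowest = average([168, B_GIVEN, 132])    # A 14s.,     C 11s.
probable = average([180, B_GIVEN, 144])  # A 15s.,     C 12s.
highest = average([198, B_GIVEN, 162])   # A 16s. 6d., C 13s. 6d.

if __name__ == "__main__":
    for label, value in (("lowest", lowest), ("probable", probable), ("highest", highest)):
        print(f"{label:9s}{shillings_and_pence(value)}")
```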

The accuracy of our interpolation then depends — (i) On 
knowledge of the possible fluctuations of the figures, to be 
obtained by a general inspection of the fluctuations at dates 
for which they are given ; (2) on knowledge of the course of 
the events with which the figures are connected. 

[Numerical example.] A second example of a similar kind* may be
given to illustrate the numerical calculation.

* Taken from Agricultural Wages in England, in the Statistical Journal,
December 1898, by the present author.
Northern Counties. Weekly Agricultural Wages in

                               1867-69.     1869-70.
                                s.  d.       s.  d.
Cheshire                       [13   1]      13   6
Lancashire                      15   0       15   0
West Riding of Yorkshire        14   6       16   5
East    "         "             14   6      [14  11]
North   "         "             14   6       15   4
Durham                          16   6       16   0
Northumberland                  16   6       16   7
Cumberland                     [14   4]      14   9
Westmoreland                   [15   7]      16   1

Figures without brackets (Roman in the original) given. Figures in
brackets (italic in the original) interpolated.

The averages of the wages in the five districts for which
data exist in both periods are 15s. 4.8d. in 1867-69 and 15s. 10.4d.
in 1869-70, that is in the ratio 33 : 34. If we assume that the
wages in the other counties have been influenced by similar
causes and increased in the same ratio, we obtain the figures
interpolated in the table. The unweighted averages for the
northern counties are now 14s. 11d. and 15s. 5d. in the two
periods, instead of 15s. 3d. and 15s. 5d., the averages of the
given numbers. For general comparison all over England
between these two years we should have been obliged to neglect
the missing counties in both years, which would have unfairly
lowered the general average, since these counties have in recent
times had wages above the English average though below that
of the northern district. At the same time we should have
unfairly raised the apparent average of the northern district.
We should also have lost the probable figures for the special
counties at the earlier date which are on a fairly safe basis;
for the wages in these counties of the Northern District remain
in nearly the same order through the last fifty years. At the
same time it is easily seen that these wages are not so accurately
known as those not interpolated, and it is well to notice in
arguments based on such figures, to what extent the interpolated
figures are involved.
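
The calculation may be followed in a few lines, working in pence and using only figures stated in the text: the two averages of 15s. 4.8d. and 15s. 10.4d., and the given Cheshire wage of 13s. 6d. at the later date. The variable names are ours.

```python
from fractions import Fraction

# Averages of the five complete counties, in tenths of a penny.
earlier_average = 15 * 120 + 48   # 15s. 4.8d.
later_average = 15 * 120 + 104    # 15s. 10.4d.

# Ratio of the earlier to the later period.
ratio = Fraction(earlier_average, later_average)

# Cheshire: 13s. 6d. (162d.) is given for 1869-70; the earlier figure
# is interpolated on the assumption that it changed in the same ratio.
cheshire_later = 13 * 12 + 6
cheshire_earlier = round(cheshire_later * ratio)
shillings, pence = divmod(cheshire_earlier, 12)

if __name__ == "__main__":
    print("ratio", ratio)
    print(f"Cheshire 1867-69: {shillings}s. {pence}d.")
```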

A process very similar to that just employed is used in
giving marks at school to students who are absent from a lesson;
attention is paid both to the particular student's general place
in the class order, and to the average value of the marks obtained
by the rest of the class in the lesson missed.

[Necessity for distinguishing interpolations.] Though the method be
fairly complete it is very important to notice that interpolated
figures rest on quite a different class of evidence to those which
are the result of direct evidence. In some cases they may represent
quantities which have no existence (as in the case of school marks)
and which are only used for convenience of calculation. In others
they are simply figures adopted as those which in default of definite
knowledge appear most probable. They must always be clearly indicated
as interpolations; it is always well to state the method by which
they are obtained, and any subsidiary information which may be
regarded as direct evidence of their accuracy, and if practicable
they may be given not as exact, but as lying between certain
limits; thus the interpolated figures for Cheshire might be
written 12s. 6d. to 13s. 6d., instead of 13s. 1d.

Several different cases are met with in interpolation, some of 
which are treated algebraically in the next section, while others 
can be illustrated at once by numerical examples. 

The Graphic Method. — If we know the values of quantities at
isolated positions, such as the numbers of the population at the
ages 25 to 35, 35 to 45, &c.; the population in 1871, 1881, 1891,
&c.; wages in 1860, 1870, 1873, &c.; the numbers whose wages are from
15s. to 20s., 20s. to 25s., &c., we may represent the facts by such a
diagram as:

[Diagram: the quantity plotted as isolated points at the years 1860,
1866, 1870, 1877, 1880, 1884.]

Suppose that we need the value of the quantity in 1875. If
we were only given the two points C and D, the simplest
hypothesis, and the one to be made in the absence of any
evidence to the contrary, is that the quantity increased uniformly
between C and D; representing such an increase by the straight
line C D, the height of the point x will represent the quantity
in 1875.
If the point E is also given, the hypothesis represented by
the straight lines C D, D E will not stand, for it assumes a sudden
break in the regularity at the point D in 1877, for which there
is no evidence. We must take into account all the points given,
and through them all a line must be drawn whose curvature is
as smooth as possible, for in the absence of evidence to the
contrary, sudden changes in the quantities may be assumed not
to exist. Such a curve can be constructed on mathematical
principles, or may be drawn freehand; if the latter, it will often
be quite as near the facts as the argument will allow us to go.
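
One way of constructing such a curve on mathematical principles is to pass a single polynomial through all the given points. The sketch below uses Lagrange's form of the interpolating polynomial, which is our choice of illustration and not necessarily the construction intended; the heights assigned to C, D and E are hypothetical, only the years 1870, 1877 and 1880 being taken from the diagram.

```python
def smooth_curve(points, x):
    """Height at x of the polynomial of lowest degree through all the points."""
    total = 0.0
    for i, (x_i, y_i) in enumerate(points):
        term = y_i
        for j, (x_j, _) in enumerate(points):
            if j != i:
                term *= (x - x_j) / (x_i - x_j)
        total += term
    return total


def straight_line(point_0, point_1, x):
    """Height at x on the straight line joining two points."""
    (x_0, y_0), (x_1, y_1) = point_0, point_1
    return y_0 + (y_1 - y_0) * (x - x_0) / (x_1 - x_0)


# Hypothetical heights for the points C, D and E.
C, D, E = (1870, 20.0), (1877, 27.0), (1880, 28.0)

estimate_line = straight_line(C, D, 1875)        # using C and D only
estimate_curve = smooth_curve([C, D, E], 1875)   # using C, D and E

if __name__ == "__main__":
    print(f"straight line C D: {estimate_line:.2f}")
    print(f"curve through C, D, E: {estimate_curve:.2f}")
```

Where the points happen to lie on one straight line the two estimates agree; otherwise the curve bends gradually instead of breaking at D.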

This method only applies to continuous quantities, such as 
numbers at different ages, population at different dates, earners 
at diflferent wages in a very large group of wages. Thus for all 
England the average wage must change gradually, but the wage 
of the London builders changed suddenly as the result of 
strikes and arrangements at certain dates. In this case we 
must draw the figure to correspond as closely as possible to 
the evidence, such as — 



where A B represents a sudden rise ; B C a gradually accelerated
increase due to improving trade, C D a slow falling off from
the wage reached at C, and D E a determined and successful
effort to recover the lost ground. 

Periodic Figures. — If we know the annual averages of 
figures which have a yearly period and a sufficient number of 
monthly averages to estimate the periodic fluctuations by the 
method described on pp. 182-7, we can interpolate figures for any
month for which the returns are incomplete with fair accuracy. 
Thus if we are dealing with the numbers of unemployed as given 
in the Labour Gazette, we find a periodicity which is not very 
strongly marked in all the months, but there is in general a fall
in the spring and a rise in the late autumn, and June is generally
the minimum month. We can then make use of the small 
diagrams on p. 184, and, having marked in all the information 
we have, draw the waves on the rising, stationary, or descending 
line of averages, so that the fluctuating lines shall pass through 
all the given points. We can obtain an idea of the accuracy 
of the resulting figures by noticing the general characteristics 
of the given figures ; we find that the percentage unemployed 
has never changed more than two units in one month, that 
there are no fluctuations which have lasted less than three or 
four months, and that the percentages have never been below 
1 or above 10. Finally, we can look at the trade history of
particular dates, and in the light we thus obtain reject any 
improbable figures. 

Use of Subsidiary Curves. — If we are able, by the 
methods described in Chapter VII., Sect. III., or Sect. V., 
to find a close connection between two series, we can use the 
more complete of them to assist the interpolation of any missing 
figures in the other. We must first investigate carefully the close- 
ness and nature of correspondence at the dates for which we 
have complete figures in both series. Then we can draw dia- 
grams, similar to those facing p. 175, one of the lines being 
incomplete. Then completing the broken line, so as to bring 
it into as close resemblance with the completed line as the given
points allow, we shall obtain the most probable values for the 
missing figures. The accuracy of the result can be tested as 
in the previous case. This method may reasonably be used 
in interpolating figures for the yield from one source of revenue 
by means of the yield from another ; for the value of exports 
from that of imports ; for the marriage rate from foreign trade ; 
for the wages in one district from those in another ; for the 
number of unemployed from the changes in consumption of 
foods ; for changes in parts of the population, when we know the 
changes in the whole, and for many other series. 

General. — Series of figures may be classed in three groups —
(1) periodic ; (2) symptomatic, where there is a general tendency
towards increase (as in serial wage statistics) or decrease (as in
the English birth rate in recent decades) ; (3) those which have
no period and no symptom, but only apparently random fluctuations.

To interpolate figures in series of the third group it is necessary
to obtain a measure of the fluctuations, by the theorems
of Part II., and then we can assign the mathematical probability
of the various numbers possible. In series of the second
group, we must pay attention in addition to the symptomatic
tendency. The necessity for interpolation of this kind does 
not, however, arise frequently, so we will not offer detailed 
illustrations of it. 
Section 2. — Algebraic Treatment.

The problem of interpolation to which most attention has been 
given may be stated as follows : — When one quantity is subject 
to continuous regular change, and a second quantity changes in
connection with it, and we know or can estimate directly only 
some discontinuous values of this second quantity, it is required 
to find the probable values of the second quantity which corre- 
spond to given values of the first : for instance, given the expec- 
tation of life at the ages 15, 20, 25, &c., it is required to find it for 
intermediate ages ; given the population of the country in 1871, 
1881, 1891, 1901, find it at intermediate dates. The only permissible
assumptions are that the quantity changes continuously,
that is with no breaks at any figure, and that the rate of change 
of the quantity is also continuous, that is that the line represent- 
ing its value is not angular, but smooth. This problem differs 
from those just discussed, in that there is likely to be a law 
binding such figures together, whereas in the former cases the 
consecution was apparently random. 

It is necessary to divide the problem into two classes : — 

A. Where the given values may be assumed to be accurate ; 

B. Where the given values are liable to correction. 

A. Some preliminary algebra is necessary ; it is derived
principally from Boole's Finite Differences and De Morgan's
Differential Calculus, to which authorities readers may be referred
for more detailed treatment.

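1. Let \(y\) be a continuous function of \(x\), and let \(y_0, y_1, y_2, \ldots\)
be any values of \(y\).

Let \(\Delta_0^1, \Delta_1^1, \Delta_2^1, \ldots\) be written for \(y_1 - y_0,\ y_2 - y_1,\ y_3 - y_2, \ldots\)

\(\Delta_0^2, \Delta_1^2, \Delta_2^2, \ldots\) for \(\Delta_1^1 - \Delta_0^1,\ \Delta_2^1 - \Delta_1^1,\ \Delta_3^1 - \Delta_2^1, \ldots\)

\(\Delta_0^3, \Delta_1^3, \Delta_2^3, \ldots\) for \(\Delta_1^2 - \Delta_0^2,\ \Delta_2^2 - \Delta_1^2,\ \Delta_3^2 - \Delta_2^2, \ldots\)

and so on.

\(\Delta_0^1, \Delta_1^1, \ldots\) are called differences of the 1st order ;

\(\Delta_0^2, \Delta_1^2, \ldots\) differences of the 2nd order, and so on.

It is easy to show that \(\Delta_0^2 = y_2 - 2y_1 + y_0\), \(\Delta_1^2 = y_3 - 2y_2 + y_1\), &c.,

\(\Delta_0^3 = y_3 - 3y_2 + 3y_1 - y_0\), \(\Delta_1^3 = y_4 - 3y_3 + 3y_2 - y_1\), &c.,
and generally that —

\[ \Delta_0^r = y_r - r\,y_{r-1} + \frac{r(r-1)}{1\cdot 2}\,y_{r-2} - \ldots \text{ to } r+1 \text{ terms} \qquad (\alpha) \]

and

\[ \Delta_s^r = y_{r+s} - r\,y_{r+s-1} + \frac{r(r-1)}{1\cdot 2}\,y_{r+s-2} - \ldots \text{ to } r+1 \text{ terms} \qquad (\beta) \]

the coefficients being those of the binomial expansion. Equations (α)
and (β) are easily proved by induction.

We can also express values of \(y\) in terms of \(y_0\) and the differences ;
\(y_1 = y_0 + \Delta_0^1\), \(y_2 = y_1 + \Delta_1^1 = y_0 + 2\Delta_0^1 + \Delta_0^2\), \(y_3 = y_0 + 3\Delta_0^1 + 3\Delta_0^2 + \Delta_0^3\), . . .
and generally —

\[ y_r = y_0 + r\Delta_0^1 + \frac{r(r-1)}{1\cdot 2}\Delta_0^2 + \frac{r(r-1)(r-2)}{1\cdot 2\cdot 3}\Delta_0^3 + \ldots \text{ to } r+1 \text{ terms} \qquad (\gamma) \]

\[ \Delta_r^s = \Delta_0^s + r\Delta_0^{s+1} + \frac{r(r-1)}{1\cdot 2}\Delta_0^{s+2} + \frac{r(r-1)(r-2)}{1\cdot 2\cdot 3}\Delta_0^{s+3} + \ldots \text{ to } r+1 \text{ terms} \qquad (\delta) \]

\[ y_{r+s} = y_s + r\Delta_s^1 + \frac{r(r-1)}{1\cdot 2}\Delta_s^2 + \frac{r(r-1)(r-2)}{1\cdot 2\cdot 3}\Delta_s^3 + \ldots \text{ to } r+1 \text{ terms} \qquad (\epsilon) \]

which formulae can be proved by induction, and in fact (γ) is a special
case both of (δ) and (ε).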
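The identities relating values and differences can be checked numerically. The sketch below is ours, not the author's; it borrows the five wage figures 296, 599, 804, 918, 966 from the first numerical example later in this section, builds the table of differences, and confirms that the leading differences reproduce the values with binomial coefficients, and conversely.

```python
from math import comb


def difference_table(ys):
    """Return [ys, 1st differences, 2nd differences, ...]."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table


ys = [296, 599, 804, 918, 966]
table = difference_table(ys)

# Leading entries of each row: y_0, then the differences with suffix 0.
leading = [row[0] for row in table]

# Each y_r rebuilt from y_0 and the leading differences (binomial weights).
rebuilt = [
    sum(comb(r, k) * leading[k] for k in range(r + 1))
    for r in range(len(ys))
]

# Each leading difference rebuilt from the values (alternating binomial weights).
from_values = [
    sum((-1) ** k * comb(r, k) * ys[r - k] for k in range(r + 1))
    for r in range(len(ys))
]

print(leading)
```

Both reconstructions agree exactly with the table, which is all the induction proofs assert for this particular series.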

2. If we assume that \(y\) can be expanded in ascending powers
of \(x\), and is a rational function of the \(n\)th order, we have —

\[ y = a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n, \qquad (\zeta) \]

where \(a_0, a_1, \ldots, a_n\) are constants.

Suppose now that \(y_0, y_1, y_2, \ldots, y_n\) are the values of \(y\) which
correspond to values of \(x\) increasing in arithmetic progression, viz.,
\(x_0,\ x_0 + h,\ x_0 + 2h, \ldots, x_0 + nh\) ; then, on this assumption, we have
\(n+1\) equations to determine the constants in equation (ζ).

We can, however, write equation (ζ) in terms of the \(y\)'s and \(\Delta\)'s
without evaluating the constants ; for

\[ y = y_0 + \frac{x - x_0}{h}\,\Delta_0^1 + \frac{x - x_0}{h}\cdot\frac{x - x_0 - h}{2h}\,\Delta_0^2 + \frac{x - x_0}{h}\cdot\frac{x - x_0 - h}{2h}\cdot\frac{x - x_0 - 2h}{3h}\,\Delta_0^3 + \ldots \text{ to } n+1 \text{ terms} \qquad (\eta) \]

is an equation of the \(n\)th degree in \(x\), and reduces to the identity (γ)
above, if we substitute \((x_0 + rh)\) and \(y_r\), where \(r\) is an integer not greater
than \(n\), for \(x\) and \(y\).

Hence equation (η) is of the same order and satisfied by the same
\(n + 1\) pairs of values as equation (ζ), and is therefore identical with it.

Again, if \(y_0, y_1, \ldots, y_n\) are values of \(y\) corresponding to any values
of \(x\), viz., \(x_0, x_1, \ldots, x_n\), it can be shown that the equation

\[ y = y_0\,\frac{(x - x_1)(x - x_2)\ldots(x - x_n)}{(x_0 - x_1)(x_0 - x_2)\ldots(x_0 - x_n)} + y_1\,\frac{(x - x_0)(x - x_2)\ldots(x - x_n)}{(x_1 - x_0)(x_1 - x_2)\ldots(x_1 - x_n)} + \ldots \qquad \text{Lagrange's formula } (\theta) \]

is equivalent to (ζ), for it is of the same degree in \(x\), and is satisfied by
the same \(n+1\) pairs of values of \((x, y)\), viz., \((x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)\).
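A minimal sketch of Lagrange's formula for unequally spaced abscissae follows. The function name and the three sample points (taken on the curve y = x², purely for checking) are assumptions of this illustration, not data from the text.

```python
def lagrange(points, x):
    """Evaluate at x the polynomial of lowest degree through the given (x_i, y_i)."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        weight = 1.0
        for j, (xj, _) in enumerate(points):
            if j != i:
                # One factor of the product (x - x_j) / (x_i - x_j).
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


# Hypothetical points lying on y = x^2, with unequal intervals.
sample = [(1, 1), (2, 4), (4, 16)]
value_at_3 = lagrange(sample, 3)
print(value_at_3)
```

Because three points fix a parabola of the second order, the interpolated ordinate at x = 3 falls on the same curve, and at each given abscissa the formula returns the given ordinate.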

3. Still assuming that an equation of the form of (ζ) satisfies
the conditions, we can at once interpolate any values needed.

Thus, if we are given that \(y_0, y_1, y_2, \ldots, y_{n-1}\) are values corresponding
to \(x = 1, 2, 3, \ldots, n\) respectively, we can find \(y_s\), where \(s\) is fractional,
by putting \(x_0 = 1\), \(h = 1\), \(x = 1 + s\) in equation (η). We obtain —

\[ y_s = y_0 + s\,\Delta_0^1 + \frac{s(s-1)}{1\cdot 2}\,\Delta_0^2 + \frac{s(s-1)(s-2)}{1\cdot 2\cdot 3}\,\Delta_0^3 + \ldots \text{ to } n \text{ terms} \qquad (\iota) \]

We can easily obtain similar formulae for any other intervals. 

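4. Notice that, if \(y_0, y_1, \ldots, y_n\) correspond to values of \(x\)
(\(x_0,\ x_0 + h\), &c.) in arithmetic progression —

\[ y_0 = a_0 + a_1 x_0 + \ldots + a_n x_0^n \]

and

\[ y_1 = a_0 + a_1 (x_0 + h) + \ldots + a_n (x_0 + h)^n, \text{ from } (\zeta) ; \]

\[ \therefore\ \Delta_0^1 = y_1 - y_0 = a_1 h + \ldots + a_n\cdot n h\,x_0^{n-1} + \text{terms of lower degree in } x_0, \]

an equation of the \((n-1)\)th degree only.

Continuing this process, we obtain —

\[ \Delta_0^n = a_n\cdot h^n\cdot n!, \text{ and there are no higher differences.} \]

Also, since \((y_0, x_0), (y_1, x_0 + h), \ldots, (y_{n+1}, x_0 + \overline{n+1}\,h)\) lie in a curve of
the \(n\)th degree, we have from equation (α) —

\[ y_{n+1} - (n+1)\,y_n + \frac{(n+1)\,n}{1\cdot 2}\,y_{n-1} - \frac{(n+1)\,n\,(n-1)}{1\cdot 2\cdot 3}\,y_{n-2} + \ldots \text{ to } n+2 \text{ terms} = \Delta_0^{n+1} = 0 \qquad (\kappa) \]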
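The two statements of this paragraph, that the \(n\)th difference of a polynomial of degree \(n\) is the constant \(a_n h^n n!\) and that the next difference vanishes, can be verified on a small case. The cubic, the starting abscissa, and the interval below are hypothetical values chosen for the check.

```python
from math import factorial


def leading_differences(values):
    """Return [y_0, first leading difference, second leading difference, ...]."""
    out, row = [], list(values)
    while row:
        out.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return out


# Hypothetical cubic y = 2 - x + 3x^3, tabulated from x0 = 1 at interval h = 2.
coeffs = [2, -1, 0, 3]
x0, h, n = 1, 2, 3
ys = [
    sum(c * (x0 + i * h) ** p for p, c in enumerate(coeffs))
    for i in range(n + 2)  # n + 2 values, so that the (n+1)th difference exists
]
lead = leading_differences(ys)

expected_nth = coeffs[n] * h ** n * factorial(n)
print(lead[n], expected_nth, lead[n + 1])
```

With five tabulated values the third leading difference equals the predicted constant and the fourth is zero, which is the property used when a missing figure is found from the vanishing difference.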

5. If for any purpose we need to evaluate the constants in
equation (ζ), we can abbreviate the solution, as follows, if the \(x\)'s
are in arithmetic progression.

Given five pairs of values, we have —

\[
\begin{aligned}
y_0 &= a_0 + a_1 x_0 + a_2 x_0^2 + a_3 x_0^3 + a_4 x_0^4 \\
y_1 &= a_0 + a_1 (x_0 + h) + a_2 (x_0 + h)^2 + a_3 (x_0 + h)^3 + a_4 (x_0 + h)^4 \\
y_2 &= a_0 + a_1 (x_0 + 2h) + a_2 (x_0 + 2h)^2 + a_3 (x_0 + 2h)^3 + a_4 (x_0 + 2h)^4 \\
y_3 &= a_0 + a_1 (x_0 + 3h) + a_2 (x_0 + 3h)^2 + a_3 (x_0 + 3h)^3 + a_4 (x_0 + 3h)^4 \\
y_4 &= a_0 + a_1 (x_0 + 4h) + a_2 (x_0 + 4h)^2 + a_3 (x_0 + 4h)^3 + a_4 (x_0 + 4h)^4
\end{aligned}
\]

As in the last paragraph \(\Delta_0^4 = a_4\cdot h^4\cdot 4!\), \(\therefore\ a_4 = \dfrac{\Delta_0^4}{24h^4}\) \(\qquad (\lambda)\)

It is easily seen that \(\Delta_0^3\) is independent of \(a_0\), \(a_1\) and \(a_2\), and that

\[
\begin{aligned}
\Delta_0^3 &= a_3\{(x_0 + 3h)^3 - 3(x_0 + 2h)^3 + 3(x_0 + h)^3 - x_0^3\} \\
&\quad + a_4\{(x_0 + 3h)^4 - 3(x_0 + 2h)^4 + 3(x_0 + h)^4 - x_0^4\} = 6h^3 a_3 + a_4\{24h^3 x_0 + 36h^4\},
\end{aligned}
\]

whence \(a_3 = \dfrac{\Delta_0^3}{6h^3} - a_4\,(4x_0 + 6h)\) ;

while \(\Delta_0^2 = 2h^2 a_2 + a_3\,(6h^2 x_0 + 6h^3) + a_4\,(12h^2 x_0^2 + 24h^3 x_0 + 14h^4)\), which
gives \(a_2\) ; \(a_1\) and \(a_0\) can then be found from the first two equations. The
points of inflexion on the curve, \(y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4\), are
determined by the equation —

\[ 0 = \frac{d^2y}{dx^2} = 2a_2 + 6a_3 x + 12a_4 x^2 \]

and the sign of \(\dfrac{d^3y}{dx^3}\), i.e. of \(a_3 + 4a_4 x\), decides the nature of the change
of curvature.

This method is employed on page 254 infra.

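6. In evaluating the constants it will be found that the
following identities are sometimes useful : —

\(n^r - {}^sC_1\,(n-1)^r + {}^sC_2\,(n-2)^r - \ldots\) = the coefficient of \(x^r\) in the expansion
of \(r!\,\{e^{nx} - {}^sC_1\,e^{(n-1)x} + {}^sC_2\,e^{(n-2)x} - \ldots\}\),

i.e. of \(r!\,\left(1 + \overline{n-s}\,x + \dfrac{(n-s)^2x^2}{2!} + \ldots\right)\cdot x^s\left(1 + \dfrac{x}{2!} + \dfrac{x^2}{3!} + \ldots\right)^s\).

Coefficient = 0, when \(r = 1, 2, 3, \ldots, s-1\) ;

= \(r!\), when \(r = s\) ;

= \(\left(n - \dfrac{s}{2}\right) r!\), when \(r = s + 1\). \(\qquad (\mu)\)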



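7. It is necessary to express the differential coefficients of
\(y\) with regard to \(x\) in equation (ζ) in terms of the differences,
and conversely. Now from a comparison of (ζ) and (η) we
have, when \(x_0\) is zero —

\[ a_1 = \frac{1}{h}\left(\Delta_0^1 - \tfrac{1}{2}\Delta_0^2 + \tfrac{1}{3}\Delta_0^3 - \tfrac{1}{4}\Delta_0^4 + \ldots\right). \]

Writing \(y = f(x)\) for equation (ζ), the equation just written gives,
when \(x = x_0 = 0\),

\[ h f'(x_0) = \Delta_0^1 - \tfrac{1}{2}\Delta_0^2 + \tfrac{1}{3}\Delta_0^3 - \ldots \qquad (\nu) \]

Applying the same process again and again, and remembering that
\(\Delta^2\) bears the same relation to \(\Delta^1\) as \(\Delta^1\) bears to \(y\), and so on, we
obtain, omitting the suffixes —

\[ h^r f^{(r)}(x) = \left(\Delta - \tfrac{1}{2}\Delta^2 + \tfrac{1}{3}\Delta^3 - \ldots\right)^r \qquad (\xi) \]

where the \(\Delta\)'s are to be treated as ordinary algebraic quantities, till the
exponent is removed.

Thus \(h^2 f''(x) = \Delta^2\left(1 - \tfrac{1}{2}\Delta + \tfrac{1}{3}\Delta^2 - \ldots\right)^2 = \Delta^2 - \Delta^3 + \tfrac{11}{12}\Delta^4 - \ldots\)

It is to be noticed in particular that each derived function
depends only on differences of as high an order as itself.

Again, by Taylor's Theorem,

\[ y_1 = f(x_0 + h) = f(x_0) + h f'(x_0) + \frac{h^2}{2!} f''(x_0) + \ldots \]

\[ y_2 = f(x_0 + 2h) = f(x_0) + 2h f'(x_0) + \frac{2^2 h^2}{2!} f''(x_0) + \ldots \]

\[ \therefore\ \Delta_0^2 = y_2 - 2y_1 + y_0 = h^2 f''(x_0) + h^3\{\ \ \} \]

and, using equations (α) and (μ), or otherwise, generally

\[ \Delta_0^r = h^r f^{(r)}(x_0) + h^{r+1}\{\ \ \} \qquad (o) \]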

8. We may now consider the assumptions made when we
took (ζ) to express the relation between \(y\) and \(x\).

If \(y\) and \(x\) are connected by any functional law, that is if \(y\) is
determinate for all given values of \(x\), without which assumption
interpolation is meaningless, then \(y\) can be expressed as a function
of \(x\) ; let \(y = f(x)\), then, by Maclaurin's theorem —

\[ y = f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \frac{x^3}{3!} f'''(0) + \ldots \]

to an infinite number of terms.

If \(f^{(n+1)}(0)\) and following coefficients are very small, and \(x\) is
never large, the terms from the \((n+2)\)th onwards become negligible
in comparison with earlier terms, so that the first \(n+1\) terms
determine the value of \(y\) approximately. Now by the equations
(ν) and (ξ), \(f^{(n+1)}\) is small when \(\Delta^{n+1}, \Delta^{n+2}, \ldots\) are small, and vice
versa. Hence we have the following general statement : any
functional relation between \(y\) and \(x\) reduces to the parabolic
equation of the \(n\)th degree (ζ), if the differences of orders higher
than the \(n\)th vanish, and if these differences do not vanish but are
small, equation (ζ) is still an approximate expression for the
relation.

Now if the line drawn through the given points is to have 
continuous and slowly changing curvature, it is easily verified 
that the second differences for points near together are not large, 
for a rapid change in the rate of increase of the ordinate means 
a rapid change of curvature ; and if we construct a second curve 
with the same abscissae and the first differences as ordinates, 
small third differences will indicate absence of rapid change in
the first, and so on ; but beyond this point it is not easy to 
see the connection between the hypothesis underlying inter- 
polation and the diminution of successive differences. The 
converse, however, is clearer ; if in any series of figures it is
found experimentally that the successive differences tend to 
disappear, then any curve which passes through the points is 
expressed approximately by the parabolic equation. De Morgan 
states this conclusion thus : — " If we take \(n\) points near each
other, and having their abscissae in arithmetic progression, with a
small or at least not very large common difference, and their ordinates
not very unequal . . . the parabola of the \((n-1)\)th order will
very nearly coincide with any regular curve of the same general
appearance, at least between the same points." Boole's explanation
is : — " It is customary to assume for the general expression
of the values under consideration a rational and integral function
of \(x\), and to determine the constants by the given conditions.
This assumption rests upon the supposition (a supposition, how- 
ever, actually verified in the case of all tabulated functions *) that 
the successive orders of differences rapidly diminish." 

Since, from equation (o), when \(h\) is small, the successive
differences for any curve diminish as their order becomes higher, 
it is a legitimate process to build up a series of values of any 
function on the hypothesis that the higher differences vanish. 

If a freehand curve is drawn so as to pass through the chosen 
fixed points, and to have curvature which changes as slowly as 
possible, a line will be obtained which lies very near that given 
by equation (ζ). Such a line would be similar to the track of a
bicyclist who was riding so as to pass over several marks, or to 
just avoid several obstacles. 

9. It is clear from the above analysis that we can make a
smooth continuous curve pass through any number of points we
please ; for with the parabolic equation (ζ) there are never any
sudden jumps in the values of \(y\), \(\dfrac{dy}{dx}\), or \(\dfrac{d^2y}{dx^2}\), as \(x\) changes
continuously ; and we can obtain as many linear equations (which
have always real values) as there are constants, simply by taking
\(n\) in the original equation to be the number of fixed points.

* That is mathematical functions such as \(\int e^{-x^2}\,dx\), not statistical
approximations.
If we have, let us say, 10 points, as —

and wish to find a point on a fixed vertical line between F and G, we
can either take only F and G into consideration, and, joining them
by a straight line, obtain the point \(X_1\) ; or considering E, F, and G,
or F, G, and H, draw parabolas and obtain \(X_2\) or \(X_3\) ; or considering
E, F, G, and H, draw a parabola of the third order, which would
have a point of inflexion near F ; this would be approximately the
path a bicyclist might follow if he had to start from E, and ride
to a near point H, passing close to F and G. If we now include
D and K (if our bicyclist has to start from D, pass E, F, G, and H,
and reach K) we shall modify the curvature throughout ; and as
we include more and more points shall continue to affect slightly
the path F G. If the inclusion of the nearer points tends to
make the line F G approximate more and more closely to a
final position, while the further inclusion of the more distant
points throws it further away, we may conclude that the positions
of these further points are not governed by the same numerical
conditions as the nearer one. Thus in a " table of survivals "
the figures for ages under 5 years are not distributed in accordance
with the curve determined by the figures for higher ages ;
in a table showing wages, it may be seen that those of highly 
paid workmen are not governed by the same causes as those 
lower in the scale. On the other hand, the number in each 
census is dependent on all the previous numbers for more than 
one generation. In interpolating for the population of 1876 we 
shall obtain different figures according as we include 1851, '61, 
'71, '81, '91 only, or 1901 as well ; and this is not surprising, for 
a mistake made in 1876 may not come to light till we have 
watched the growth of the population for twenty-five years. It 
is clear that the points far from the period in which the inter- 
polation is to be done cannot be allowed so much influence as 
those nearer, and it appears experimentally that this condition
is fulfilled in the method discussed ; also, in series (η) the successive
coefficients begin to diminish with the \(r\)th term where
\(x < x_0 + (2r - 3)h\), that is with the coefficient of the first difference
when \(x\) is between \(x_0\) and \(x_0 + h\). It may be noticed that
the wanderings of the curve are limited by the condition that a
curve of the \((n-1)\)th order cannot have more than \(n - 3\) points of
inflexion, for \(\dfrac{d^2y}{dx^2}\) has no term of a higher degree than \(x^{n-3}\).

In the above illustration the intermediate points from F to 
G might be found from the five points D, E, F, G, H, or from 
E, F, G, H, K. These two curves may be welded together be- 
tween F and G. The points near F are more accurately deter- 
mined by the first, of which it is the middle ; those near G by the 
second. The welding line should touch the first at F, the second 
at G. This is conveniently done by the use of the sine curve. 
This method is employed, I believe, at the Registrar-General's 
office. 

It cannot be said that the present theory of statistical inter- 
polation rests on an altogether satisfactory basis.* The prin- 
ciples which govern it are not well defined, and the mathematical 
analysis of the methods, by which the principles should be 
brought into relation with the facts, is incomplete. Yet it is 
perhaps unnecessary to labour after more refined methods, for 
interpolation cannot be precise unless we actually know the 
algebraic expression of the laws which govern the figures, and 
the method here discussed is found to satisfy the conditions 
empirically, while further refinements could only introduce slight 
modifications. 

* This remark does not apply to the interpolation in evaluating mathe- 
matical functions. 
10. Examples showing the Numerical Use of the
FORMULÆ. — (1.) Given the number of wage-earners earning
sums in 5s. groups, to estimate the number earning as much as
24s. and not so much as 25s.





Numbers per 1,000 Earners (Adult males).*

Earning as much as                     DIFFERENCES.
10s. and less than    Numbers.    1st.    2nd.    3rd.    4th.
      15s.               39       257      46    -144
      20s.              296       303     -98       7      18
      25s.              599       205     -91      25
      30s.              804       114     -66
      35s.              918        48
      40s.              966

Neglect the increasing differences arising from the number earning
less than 15s.

Using formula (η), \(x_0 = 20\) (shillings), \(h = 5\), \(y_0 = 296\), \(\Delta_0^1 = 303\).

At 25s., \(y = 599\), from above table.

At 24s., \(x = 24\),

\(y = 296 + \tfrac{4}{5}\) of \(303 - \tfrac{2}{25}\) of \((-98) + \tfrac{4}{125}\) of \(7 + \tfrac{11}{625}\) of \(18\)

\(= 296 + 242.4 + 7.84 + .224 + .3168 = 547\) (nearly).

The required number is therefore 599 - 547 = 52.

Again at 23s., \(x = x_0 + 3\), \(y = 489\), and the number earning as much
as 23s. and not so much as 24s. is 58.
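The calculation above can be retraced with a short routine for formula (η). This is a sketch under our own naming, using only the five tabulated numbers from 20s. to 40s. Evaluated with the signs the formula prescribes, the fourth-difference coefficient at 24s. is negative (-11/625), so the routine returns about 546.15 rather than the printed 547; the printed figure corresponds to taking that last term as positive. At 23s. the routine agrees with the printed 489.

```python
def leading_differences(values):
    """Return [y_0, first leading difference, second leading difference, ...]."""
    out, row = [], list(values)
    while row:
        out.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return out


def newton_forward(x0, h, ys, x):
    """Evaluate the difference formula for equally spaced abscissae at x."""
    s = (x - x0) / h
    total, coeff = 0.0, 1.0
    for k, diff in enumerate(leading_differences(ys)):
        total += coeff * diff
        coeff *= (s - k) / (k + 1)  # next binomial coefficient in s
    return total


# Numbers per 1,000 earning less than 20s., 25s., 30s., 35s., 40s.
numbers = [296, 599, 804, 918, 966]

y24 = newton_forward(20, 5, numbers, 24)
y23 = newton_forward(20, 5, numbers, 23)
print(round(y24, 4), round(y23, 4))
```

The disagreement at 24s. amounts to about two-thirds of a unit per thousand, well inside the uncertainty of the returns themselves.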

(2.) To make an estimate for the value of imports in the year
. . . , the records for which were destroyed by fire.

Value of imports in —

   \(y_1\) = £39,202,000
   \(y_2\) =  26,510,000
   \(y_3\) =  26,163,000
   \(y_4\) =  (required)
   \(y_5\) =  33,755,000
   \(y_6\) =  32,987,000
   \(y_7\) =  27,431,000

* General Report on Wages.
From formula (κ), using \(y_3\) and \(y_5\) only —

\(y_3 - 2y_4 + y_5 = 0\), \(y_4 = 29,959\) (thousands).

From formula (κ), using \(y_2\) and \(y_6\) as well —

\(y_6 + y_2 - 4(y_5 + y_3) + 6y_4 = 0\), \(y_4 = 30,029\).

From formula (κ), using \(y_1\) and \(y_7\) as well —

\(y_7 + y_1 - 6(y_6 + y_2) + 15(y_5 + y_3) - 20y_4 = 0\), \(y_4 = 30,421\).

Here the first and second values are very near together,
while the third differs ; hence we adopt £30,000,000 as the value
required. (Cf. a similar example in Boole's chapter.)
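The three estimates may be reproduced by setting the second, fourth, and sixth differences centred on the missing year equal to zero and solving for the missing value. The sketch below assumes the six known figures as read from the table (in thousands of pounds); the function and variable names are ours.

```python
from math import comb

# Known import values in thousands of pounds, keyed by suffix; y_4 is missing.
imports = {1: 39202, 2: 26510, 3: 26163, 5: 33755, 6: 32987, 7: 27431}


def estimate_centre(values, centre, m):
    """Solve 'difference of order 2m = 0' for the missing centre value,
    using m known neighbours on each side."""
    order = 2 * m
    known = 0
    for k in range(order + 1):
        idx = centre - m + k
        if idx != centre:
            known += (-1) ** k * comb(order, k) * values[idx]
    centre_coeff = (-1) ** m * comb(order, m)
    return -known / centre_coeff


estimates = [estimate_centre(imports, 4, m) for m in (1, 2, 3)]
print([round(e) for e in estimates])
```

The first two estimates lie within about seventy thousand pounds of each other, while bringing in the outermost years moves the result by some four hundred thousand, which is the ground for adopting the round figure.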

(3.) In Mr Booth's Life and Labour of the People, e.g., Vol.
V., p. 46, a series of very useful diagrams is given showing the
age distribution of various classes. The figures he uses are as
follows : —

                   Proportion           Average at
                   occupied per         each year of
                   10,000 of total      age between
Ages.              aged 10-80.          given limits.

10-15 years            193.5                38.7
15-20   "              880                 176
20-25   "              933                 186.6
25-35   "            1,636                 163.6
35-45   "            1,201                 120.1
45-55   "              830                  83
55-65   "              434                  43.4
65-80   "              192.5                12.8

His diagram is drawn from the last column, the numbers in 
which form the ordinates for the middle of the corresponding 
age periods. The points so obtained are joined by straight lines. 
This method is sufficiently accurate for his purpose, but it will 
afford an interesting example of interpolation if we obtain some 
of the figures for intermediate years more closely. 

                      Numbers up to
Mean Age.             corresponding limits.

\(x_1\) = 12½             \(y_1\) = 193.5
\(x_2\) = 17½             \(y_2\) = 1073.5
\(x_3\) = 22½             \(y_3\) = 2006.5
\(x_4\) = 30              \(y_4\) = 3642.5
\(x_5\) = 40              \(y_5\) = 4843.5
\(x_6\) = 50              \(y_6\) = 5673.5
\(x_7\) = 60              \(y_7\) = 6107.5
\(x_8\) = 72½             \(y_8\) = 6300
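Since the \(x\)'s are not in arithmetic progression, we must
use Lagrange's formula (θ).

To find the ordinate corresponding to the age 35, for example,
we will include the five values of \(y\) from \(y_2\) to \(y_6\).

\[
\begin{aligned}
\text{Then } y ={}& 1073.5 \times \frac{12\tfrac12\cdot 5\,(-5)(-15)}{(-5)(-12\tfrac12)(-22\tfrac12)(-32\tfrac12)} + 2006.5 \times \frac{17\tfrac12\cdot 5\,(-5)(-15)}{5\,(-7\tfrac12)(-17\tfrac12)(-27\tfrac12)} \\
&+ 3642.5 \times \frac{17\tfrac12\,(12\tfrac12)(-5)(-15)}{12\tfrac12\cdot 7\tfrac12\,(-10)(-20)} + 4843.5 \times \frac{17\tfrac12\,(12\tfrac12)(5)(-15)}{22\tfrac12\cdot 17\tfrac12\cdot 10\,(-10)} \\
&+ 5673.5 \times \frac{17\tfrac12\cdot 12\tfrac12\cdot 5\,(-5)}{32\tfrac12\cdot 27\tfrac12\cdot 20\cdot 10} \\
={}& 4412.
\end{aligned}
\]

Mr Booth's method gives 4243 for the same position.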
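As a check on the arithmetic, the same five pairs can be passed through a general routine for Lagrange's formula. The sketch assumes the pairing of abscissae and ordinates used in the calculation above (17½, 22½, 30, 40, 50 against 1073.5 to 5673.5); the function name is ours.

```python
def lagrange(points, x):
    """Evaluate at x the polynomial of lowest degree through the given (x_i, y_i)."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        weight = 1.0
        for j, (xj, _) in enumerate(points):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


# The five pairs (x_2, y_2) ... (x_6, y_6) used in the text.
pairs = [(17.5, 1073.5), (22.5, 2006.5), (30, 3642.5), (40, 4843.5), (50, 5673.5)]

y_at_35 = lagrange(pairs, 35)

# Straight-line value between the two neighbouring points, as in the diagram.
booth_at_35 = (3642.5 + 4843.5) / 2
print(round(y_at_35, 1), booth_at_35)
```

The curve through five points lies about 170 above the chord, the excess being what the straight-line diagram loses by ignoring the curvature between ages 30 and 40.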

(4.) We can now determine the median and the mode more 
accurately than before. We will use the figures already em- 
ployed in Chapter IV., which may be retabulated thus : — 



Earning                           Differences.                 Corresponding
more than    Numbers.     1st.     2nd.     3rd.     4th.       Abscissae.

 $4.75            9                                                19
  4.25           13                                                17
  3.75          109                                                15
  3.25          363                                                13
  2.75          561       506      464     -137      -15           11
  2.25        1,067       970      327     -152   -1,178            9
  1.75        2,037     1,297      175   -1,330                     7
  1.25        3,334     1,472   -1,155                              5
   .75        4,806       317                                       3
   .25        5,123                                                 1

Unit of abscissa, $.25.

To find the median use the five points whose abscissae are 11,
9, 7, 5, 3.

Equation (ζ) gives —

\[
\begin{aligned}
561 &= a_0 + a_1\cdot 11 + a_2\cdot 11^2 + a_3\cdot 11^3 + a_4\cdot 11^4 \\
1067 &= a_0 + a_1\cdot 9 + a_2\cdot 9^2 + a_3\cdot 9^3 + a_4\cdot 9^4 \\
2037 &= a_0 + a_1\cdot 7 + a_2\cdot 7^2 + a_3\cdot 7^3 + a_4\cdot 7^4 \\
3334 &= a_0 + a_1\cdot 5 + a_2\cdot 5^2 + a_3\cdot 5^3 + a_4\cdot 5^4 \\
4806 &= a_0 + a_1\cdot 3 + a_2\cdot 3^2 + a_3\cdot 3^3 + a_4\cdot 3^4
\end{aligned}
\]

Using equations (λ), we have, since \(x_0 = 11\) and \(h = -2\) —

\[ a_4 = \frac{-15}{24\times 16} = -\frac{5}{128}, \qquad a_3 = \frac{-137}{6\times(-8)} + \frac{5}{128}\,(4\times 11 - 6\times 2) = \frac{197}{48}, \]

\[ 464 = 8a_2 + \frac{197}{48}\,(24\times 11 - 6\times 8) - \frac{5}{128}\,(48\times 11^2 - 24\times 8\times 11 + 14\times 16), \]

\[ \therefore\ a_2 = -33\tfrac{43}{64}. \]

Equations (μ) could have been used with advantage if the difference
between successive abscissae had been unity.
\(a_1\) is found from the equation —

\[ -1472 = 2a_1 + 16a_2 + 98a_3 + 544a_4 \]

and finally \(a_0 = 6972\tfrac{91}{128}\).

The median is then found from the equation —

\[ 2561\tfrac12 = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4. \]

Solving by Horner's method, we find \(x = 6.142\) ; and, therefore, the
median is at $1.536.
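The constants can be recomputed exactly from equations (λ), and the biquadratic solved numerically. The sketch below uses exact fractions for the constants and then plain bisection in place of Horner's method, which is a substitution of ours; it relies on the curve falling steadily between the abscissae 5 and 7.

```python
from fractions import Fraction as F

xs = [11, 9, 7, 5, 3]
ys = [561, 1067, 2037, 3334, 4806]
x0, h = xs[0], xs[1] - xs[0]  # 11 and -2


def leading_differences(values):
    out, row = [], list(values)
    while row:
        out.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return out


_, d1, d2, d3, d4 = leading_differences(ys)  # 506, 464, -137, -15

# Constants from the abbreviated solution of paragraph 5.
a4 = F(d4, 24 * h**4)
a3 = F(d3, 6 * h**3) - a4 * (4 * x0 + 6 * h)
a2 = (d2 - a3 * (6 * h**2 * x0 + 6 * h**3)
      - a4 * (12 * h**2 * x0**2 + 24 * h**3 * x0 + 14 * h**4)) / (2 * h**2)
# a1 from the difference of the equations for x = 5 and x = 3, then a0.
a1 = (ys[3] - ys[4] - 16 * a2 - 98 * a3 - 544 * a4) / 2
a0 = ys[4] - 3 * a1 - 9 * a2 - 27 * a3 - 81 * a4


def curve(x):
    return float(a0) + float(a1) * x + float(a2) * x**2 + float(a3) * x**3 + float(a4) * x**4


def solve_decreasing(f, lo, hi, target, tol=1e-9):
    """Bisection for f(x) = target, with f decreasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


median_x = solve_decreasing(curve, 5.0, 7.0, 5123 / 2)
median_dollars = 0.25 * median_x
print(a0, a1, a2, a3, a4, round(median_x, 3))
```

The exact constants agree with the mixed fractions quoted above, and the root agrees with the printed abscissa to the third decimal place or very nearly.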

Second method : — 

Suppose \(x\) expressed as a function of \(y\),* and apply Lagrange's
formula (θ) suitably altered.

\[ x = 3\cdot\frac{(2561\tfrac12 - 3334)(2561\tfrac12 - 2037)(2561\tfrac12 - 1067)(2561\tfrac12 - 561)}{(4806 - 3334)(4806 - 2037)(4806 - 1067)(4806 - 561)} + 5\cdot\frac{(2561\tfrac12 - 4806)(\ \ )(\ \ )(\ \ )}{(3334 - 4806)(\ \ )(\ \ )(\ \ )} + \ldots \]

whence \(x = 6.237\) ; that is the median is at $1.56.

This method saves the solution of a biquadratic, and with 
small numbers would need less numerical work than the first 
method. 
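The second method interchanges the parts played by the two variables, and may be sketched as follows. The routine is a general Lagrange evaluation under our own naming; the five pairs are those of the first method with ordinate and abscissa exchanged. Carried to more decimal places the result comes out near 6.232, a little below the printed 6.237, and the median in dollars still rounds to 1.56.

```python
def lagrange(points, t):
    """Evaluate at t the polynomial of lowest degree through the given pairs."""
    total = 0.0
    for i, (ti, vi) in enumerate(points):
        weight = 1.0
        for j, (tj, _) in enumerate(points):
            if j != i:
                weight *= (t - tj) / (ti - tj)
        total += vi * weight
    return total


# (number earning more than the limit, abscissa of the limit): y first, x second.
inverse_pairs = [(4806, 3), (3334, 5), (2037, 7), (1067, 9), (561, 11)]

median_x = lagrange(inverse_pairs, 5123 / 2)
median_dollars = 0.25 * median_x
print(round(median_x, 3), round(median_dollars, 2))
```

The two methods fit different curves to the same five points (one a quartic in x, the other a quartic in y), so a difference of about a tenth of a unit in the abscissa is to be expected.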

Third method : — 

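Use formula (η) to obtain the necessary equation.

\[
\begin{aligned}
\text{Thus } 2561\tfrac12 = y ={}& 561 + \frac{x - 11}{-2} \text{ of } 506 + \frac{(x - 11)(x - 9)}{-2\times -4} \text{ of } 464 \\
&+ \frac{(x - 11)(x - 9)(x - 7)}{-2\times -4\times -6} \text{ of } (-137) \\
&+ \frac{(x - 11)(x - 9)(x - 7)(x - 5)}{-2\times -4\times -6\times -8} \text{ of } (-15).
\end{aligned}
\]

This reduces to the same equation as in the first method —

\[ 2561\tfrac12 = 6972\tfrac{91}{128} - 657\tfrac{5}{48}\,x - 33\tfrac{43}{64}\,x^2 + \frac{197x^3}{48} - \frac{5x^4}{128}. \]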

The quartiles, deciles, and percentiles can be found by similar 
methods. 

* Compare Edgeworth's Representation of Statistics by Mathematical
Formulæ, Statistical Journal, 1898, p. 699.
For the mode we must take in the last number, 5123, and
recalculate \(a_2\), \(a_3\), \(a_4\) for the five highest values of \(y\), and then
solve the quadratic given in paragraph 5, viz. —

\[ 0 = \frac{d^2y}{dx^2} = 2a_2 + 6a_3 x + 12a_4 x^2, \]

giving the constants their new values.

Hence \(x\) = 8.2 or 4.40 : \(\dfrac{d^3y}{dx^3}\) is positive and \(\therefore\ -\dfrac{dy}{dx}\) a maximum,
when \(x\) = 4.40. The mode is then at $1.10.

The mode can of course be determined less accurately by 
taking 4 or 3 given points instead of 5, or for greater accuracy 
more can sometimes be used. 

Another mode may be found between $2.75 and $3.75 from 
the five highest abscissae. This proves to be at $3.20. 

This method is applicable to such problems as the determina- 
tion of the date at which the population, the marriage, birth, and 
death rates, &c., increased most rapidly ; at what age the chance 
of death increases most, &c.* 
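The determination of the mode may be retraced by recomputing the constants for the five highest values of y and solving the quadratic for the points of inflexion. The sketch below follows the formulae of paragraph 5 with names of our own; on these figures the roots come out at about 4.40 and 8.11, and the third derivative is positive at the smaller root.

```python
from math import sqrt

# Five highest values of y, at abscissae 9, 7, 5, 3, 1.
ys = [1067, 2037, 3334, 4806, 5123]
x0, h = 9, -2


def leading_differences(values):
    out, row = [], list(values)
    while row:
        out.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return out


_, d1, d2, d3, d4 = leading_differences(ys)  # 970, 327, -152, -1178

a4 = d4 / (24 * h**4)
a3 = d3 / (6 * h**3) - a4 * (4 * x0 + 6 * h)
a2 = (d2 - a3 * (6 * h**2 * x0 + 6 * h**3)
      - a4 * (12 * h**2 * x0**2 + 24 * h**3 * x0 + 14 * h**4)) / (2 * h**2)

# Points of inflexion: 0 = 2*a2 + 6*a3*x + 12*a4*x^2.
A, B, C = 12 * a4, 6 * a3, 2 * a2
root_disc = sqrt(B * B - 4 * A * C)
roots = sorted([(-B + root_disc) / (2 * A), (-B - root_disc) / (2 * A)])

# The slope -dy/dx is a maximum where the third derivative of y is positive.
third = [6 * a3 + 24 * a4 * x for x in roots]
mode_x = next(x for x, t in zip(roots, third) if t > 0)
mode_dollars = 0.25 * mode_x
print([round(r, 2) for r in roots], round(mode_dollars, 2))
```

Since the table gives numbers earning more than each sum, the frequency is the negative slope of the curve, and the mode lies where that slope is steepest.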

B. The second division of the problem of interpolation is 
when the original returns have to be corrected, e.g., the determination
of the distribution by age from the census returns.

We have now the problem of drawing a smooth line in the 
neighbourhood of a great number of points, but not necessarily 
through any of them. The assumption is that the returns are 
insufficient in number or deficient in accuracy, and that they 
indicate a regular distribution which it is required to represent.

1. One method is to assume that the averages over fairly 
large groups are accurate, and to these averages to apply any of 
the methods discussed under group A. 

2. A second method has been used in the section in which 
various curves were smoothed (vide supra, Chapter VII.). This 
may be restated as follows : — Take successive groups of 2, or 3, 
or 4 .... 10 points, beginning again and again at the ordinates 
for each of the given abscissae. Find the centres of gravity of 
each group ; that is, erect an ordinate equal to the average 
of the ordinates of a group at the point half-way between the 
ends of the abscissae of the outside ordinates of the group. 
Draw a line through the points so obtained. It will be found 
that this line satisfies all the conditions laid down. An 
example of this method is given in the diagram facing p. 151. 
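The rule for centres of gravity amounts to a moving average, each average being placed half-way between the first and last abscissae of its group. The sketch below uses a group of three and, purely as sample data, the series of imported wheat per head for 1890 to 1898 that is tabulated later in this section; the function name is ours.

```python
def group_centres(xs, ys, size):
    """Average each run of `size` consecutive ordinates and place the result
    half-way between the outside abscissae of the run."""
    points = []
    for start in range(len(ys) - size + 1):
        group_x = xs[start:start + size]
        group_y = ys[start:start + size]
        centre = (group_x[0] + group_x[-1]) / 2
        points.append((centre, sum(group_y) / size))
    return points


years = list(range(1890, 1899))
wheat = [226, 244, 245, 248, 256, 285, 257, 228, 238]  # lbs. per head

smoothed = group_centres(years, wheat, 3)
for year, value in smoothed:
    print(year, round(value, 1))
```

Larger groups give a smoother line but shorten it at both ends, since no centre can be formed for the first and last few ordinates.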

* Cf. Edgeworth, in Statistical Journal, 1899, p. 381, and the references
there given. 
3. In another method* the original figures are smoothed till
the differences of the fourth or fifth or higher orders vanish ; and
then the ordinary formulae of interpolation are applied.

Thus in example 1, on page 250, rewrite the table thus : —

               Smoothed               Corrected Differences.
Wages.         Numbers.          1st.            2nd.              3rd.

Up to 20s.     296               303 + a         -98 - a + b       7 - 3b
  "   25s.     599 + a           205 + b         -91 - a - 2b      25 + 2a + 3b
  "   30s.     804 + a + b       114 - a - b     -66 + a + b
  "   35s.     918               48
  "   40s.     966

If we put \(b = 2\tfrac13\), \(a = -16\), the third differences vanish, and we have
\(\Delta_0^1 = 287\), \(\Delta_0^2 = -79\tfrac23\), \(\Delta_0^3 = \Delta_0^4 = 0\) ; when \(x = 25\), \(y = 583\), and when

\(x = 24\), \(y = 296 + \tfrac{4}{5}\) of \(287 - \tfrac{2}{25}\) of \((-79\tfrac23) = 532\) ;
so that the number earning as much as 24s. and not so much as
25s. is now found to be 51, instead of 52.
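The choice of a and b can be obtained mechanically, since each third difference is linear in the two corrections. The sketch below is an illustration under our own naming: it takes the third differences of the raw figures and of the two correction patterns, solves the pair of equations, and then interpolates at 24s. from the smoothed figures.

```python
from fractions import Fraction as F


def differences(values, order):
    """Differences of the given order of a list of values."""
    row = list(values)
    for _ in range(order):
        row = [y - x for x, y in zip(row, row[1:])]
    return row


raw = [296, 599, 804, 918, 966]   # up to 20s., 25s., 30s., 35s., 40s.
unit_a = [0, 1, 1, 0, 0]          # where the correction a enters
unit_b = [0, 0, 1, 0, 0]          # where the correction b enters

r = differences(raw, 3)           # third differences of the raw figures
pa = differences(unit_a, 3)       # their coefficients of a
pb = differences(unit_b, 3)       # their coefficients of b

# Solve r + a*pa + b*pb = 0 for both third differences (Cramer's rule).
det = pa[0] * pb[1] - pa[1] * pb[0]
a = F(-r[0] * pb[1] + r[1] * pb[0], det)
b = F(-pa[0] * r[1] + pa[1] * r[0], det)

smoothed = [y + a * ua + b * ub for y, ua, ub in zip(raw, unit_a, unit_b)]
d1 = smoothed[1] - smoothed[0]
d2 = smoothed[2] - 2 * smoothed[1] + smoothed[0]

s = F(4, 5)                       # 24s. is four-fifths of the way from 20s. to 25s.
y24 = smoothed[0] + s * d1 + s * (s - 1) / 2 * d2
print(a, b, float(y24), float(smoothed[1] - y24))
```

Only two figures are altered, by 16 and about 14 units per thousand, yet the table thereafter obeys a simple second-degree law throughout the range.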

The corrections may be applied to any of the original figures. 

We need to solve only one more equation to complete our table 
from 20s. to 30s. 

When \(x = 23\), \(y = 296 + \tfrac{3}{5}\) of \(287 + \tfrac{3}{25}\) of \(79\tfrac23\). The difference between
this and the value of \(y\), when \(x = 24\), is \(\tfrac{1}{5}\) of \(287 - \tfrac{1}{25}\) of \(79\tfrac23 = 54.2\).

We have therefore the following table, where the figures in
brackets have already been calculated, while the others are added
on the assumption that the third differences are zero.

                               Differences.
Wages.        Numbers.     1st.      2nd.     3rd.

Up to 20s.     [296]       63.8      -3.2     . . .
  "   21s.      360        60.6      -3.2
  "   22s.      420        57.4      -3.2
  "   23s.      478       [54.2]     -3.2
  "   24s.     [532]      [51]       -3.2
  "   25s.     [583]       47.8      -3.2
  "   26s.      631        44.6      -3.2
  "   27s.      677        41.4      -3.2
  "   28s.      719        38.2      -3.2
  "   29s.      757        35
  "   30s.      792

If we had taken the second differences more exactly, we
should have obtained \(804 + a + b = 790\tfrac13\) for the last figure
as in the previous table.

* Suggested to me by Mr W. F. Sheppard.

This method of writing down many figures when the signifi- 
cant differences have been found can be very generally applied 
in Group A as well as here. 

4. Another method, involving higher mathematics, would be 
discussed more suitably after the section devoted to the law of 
error ; a brief explanation with a useful formula may, however, 
be offered here. 

Suppose we have five consecutive points \((-2, y_{-2})\), \((-1, y_{-1})\),
\((0, y_0)\), \((1, y_1)\), \((2, y_2)\) given.

A parabola of the fourth order could be drawn through these
five points, but would have two points of inflexion. A great
number of parabolas of the third order can be drawn near all
the points, having no points of inflexion, and satisfying all the
ordinary conditions of interpolation.

Borrowing a principle from the method of least squares,* if
the coefficients of the parabola \(y = a + bx + cx^2 + dx^3\) are chosen
so as to make the quantity

\[ \sum\,(a + bx + cx^2 + dx^3 - y)^2 \]

(where the summation extends over the five pairs of values of
\(x\) and \(y\)) a minimum, the parabola so determined will be the
best for the purpose.

For the necessary mathematical analysis, Professor Darwin's
paper On Fallible Measures,† from which this method is taken,
should be consulted.

The following equation is obtained —
\(a = y_0 - \tfrac{3}{35} \times \Delta^4\), where \(\Delta^4\) is the difference of the fourth order for
the \(y\)'s.

Now replace the point \((0, y_0)\) by the intersection of its
ordinate with the parabola, that is by \((0, a)\), where \(a\) has the
value just given, that is, diminish \(y_0\) by the quantity \(\tfrac{3}{35}\Delta^4\).
Repeat the same process for each point on the original line,
taking it as the middle of a group of 5, and a smooth curve
very near all the original points is obtained.

* See Merriman's Method of Least Squares, Chap. III.
† See Phil. Mag. and Journal, July 1877.
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Thus we may smooth line C in diagram facing p. 164. 



Imported Wheat per head of the Population.

Year.   lbs.          Differences.              Smoothed Figures.
                1st.   2nd.   3rd.   4th.

1890    226
                 18
1891    244            -17
                  1            19
1892    245              2           -16     245 + 3/35 of 16  = 246⅓
                  3             3
1893    248              5            13     248 - 3/35 of 13  = 247
                  8            16
1894    256             21           -94     256 + 3/35 of 94  = 264
                 29           -78
1895    285            -57           134     285 - 3/35 of 134 = 273½
                -28            56
1896    257             -1           -16     257 + 3/35 of 16  = 258⅓
                -29            40
1897    228             39
                 10
1898    238
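
The smoothing rule of paragraph 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented here, the fourth difference is taken as $y_{-2} - 4y_{-1} + 6y_0 - 4y_1 + y_2$, and every point is corrected from the original (not the already smoothed) figures. The two points at each end are left unchanged because they cannot stand in the middle of a group of five.

```python
def fourth_difference(y, i):
    """Central fourth difference of the series y about position i."""
    return y[i - 2] - 4 * y[i - 1] + 6 * y[i] - 4 * y[i + 1] + y[i + 2]


def smooth_five_point(y):
    """Replace each interior value y0 by y0 - (3/35) * (fourth difference).

    Hypothetical helper: the first two and last two values are copied as they are.
    """
    smoothed = [float(v) for v in y]
    for i in range(2, len(y) - 2):
        smoothed[i] = y[i] - 3 * fourth_difference(y, i) / 35
    return smoothed


# Imported wheat per head (lbs.), 1890 to 1898, from the table in the text.
wheat = [226, 244, 245, 248, 256, 285, 257, 228, 238]
smoothed = smooth_five_point(wheat)

for year, raw, new in zip(range(1890, 1899), wheat, smoothed):
    print(year, raw, round(new, 1))
```

The fourth differences computed this way are -16, 13, -94, 134, -16, so the corrections have the signs shown in the column of smoothed figures.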



5. In many series of observations it is found that the num- 
bers very nearly satisfy some algebraic formulae,* such as the 
binomial expansion, the geometric progression, the law of error, 
or some specially chosen expression. In such a case the con- 
stants of the equation chosen are computed by methods similar 
to that of the last paragraph, and the original observations are 
replaced by the ordinates of the curve thus determined. Prof. 
Pareto has found an equation which fits the data of the distribution
of incomes.† Modern mathematical statistics deals very
frequently in such formulae. Here we will briefly describe one
which has very practical utility, namely, Makeham's formula
for the life table.‡ If $l_x$ is the number who survive to the age $x$
out of a given generation, then the formula $l_x = k\,s^x\,g^{c^x}$, where
$k$, $s$, $g$, $c$ are selected constants, fits the records from the ages of
20 upwards with such exactness, that the formula is used for
practical actuarial calculations. The formula is not quite arbi- 
trary, but can be obtained from the hypotheses described in the 
following paragraph. 

Let the quantity $\dfrac{l_x - l_{x+dx}}{l_x} = -\dfrac{dl_x}{l_x} = \mu_x\,dx$. Then $\mu_x$ is called the
"force of mortality," and represents the ratio of the number of persons
dying in a short interval to the total number alive at the beginning
of the interval.

* See Edgeworth, ibid., p. 671.

† See Statistical Journal, 1896, p. 533.

‡ See Institute of Actuaries Text-Book, Part II.

This force is supposed to consist of two parts, one constant and $= A$;
and the other such that the ratio of the increase of the force to the force
is constant, that is, that the force continually increases in a geometric
progression. For the latter part ($\mu'_x$)

$$\log \mu'_x = Dx + E,$$

$$\mu'_x = e^{Dx+E} = Bc^x, \quad \text{where } B = e^E \text{ and } \log c = D.$$

Then $\mu_x = A + Bc^x$.

This equation represents the hypothesis that the chance of death
consists of one part which is constant for all ages, and another which
is due to the power of resisting death diminishing continuously with
age in a constant ratio.

We have $-\dfrac{dl_x}{l_x} = \mu_x\,dx = (A + Bc^x)\,dx$.

$\therefore\ -\log l_x = Ax + k_1 c^x + k_2$, where $k_1$, $k_2$ are constants
(integration gives $k_1 = B/\log c$).

$\therefore\ l_x = k\,s^x\,g^{c^x}$, where $-A = \log s$,

$-k_1 = \log g$,

$-k_2 = \log k$.
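
The passage from the force of mortality $\mu_x = A + Bc^x$ to the life-table formula $l_x = k s^x g^{c^x}$ can be checked numerically. The sketch below is an illustration only: the constants A, B and c are made-up values, not taken from any life table, the function names are invented, and natural logarithms are assumed throughout.

```python
import math

# Hypothetical constants, chosen only for illustration.
A, B, c = 0.0007, 0.00005, 1.10

# Constants of Makeham's formula implied by the derivation in the text:
# -A = log s, -k1 = log g with k1 = B / log c, and k fixes the initial number.
s = math.exp(-A)
g = math.exp(-B / math.log(c))
k = 100_000 / g          # so that l_0 = k * g = 100,000 persons


def survivors(x):
    """l_x = k * s**x * g**(c**x)."""
    return k * s ** x * g ** (c ** x)


def force_of_mortality(x):
    """mu_x = A + B * c**x."""
    return A + B * c ** x


def numerical_force(x, h=1e-5):
    """-d(log l_x)/dx estimated by a central difference."""
    return -(math.log(survivors(x + h)) - math.log(survivors(x - h))) / (2 * h)


for age in (20, 50, 80):
    print(age, round(survivors(age)), force_of_mortality(age), numerical_force(age))
```

For each age the two estimates of the force of mortality agree closely, which is what the integration in the text asserts.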

For further information on the subject of interpolation, the reader is referred
to Dr Farr's Life Table (No. 3), 1864, Boole's Finite Differences, Text-Book of
Institute of Actuaries, Part II., p. 420 seq., Rice's Theory and Practice of
Interpolation, 1899, Merrifield On Quadratures and Interpolation (British
Association Report, 1880), Chauvenet's Spherical and Practical Astronomy (Chap. II.),
Woodhouse in the Assurance Magazine (Vols. XI., XII.), Professor J. D.
Everett's Papers (published or forthcoming) On the Algebra of Difference
Tables (Quarterly Journal of Mathematics, No. 124, 1900), On a Central-difference
Interpolation Formula (British Association Report, 1900), and in the
Journal of the Institute of Actuaries, January 1901, and Mr W. F. Sheppard's
Papers On Central Difference Formulae (Proceedings of the London Mathematical
Society, Vol. XXXI., Nos. 707-710), and On the Use of Auxiliary
Curves in Statistics of Continuous Variation (Statistical Journal, September
1900). In these other references will be found. Part of the foregoing
chapter might be simplified by the use of "central differences," but in so
short an introduction to the subject it seemed best to keep to the more
familiar method.

PART II.

APPLICATION OF THE THEORY OF PROBABILITY 

TO STATISTICS. 



SECTION 1. 

Introductory. 

[Side-note: Object of Part II.]

The arguments on which the theory of algebraic probability
depends are not difficult to follow, and are in fact grounded
on every-day experience; the development of calculations
also is often little more than straightforward arithmetic; and
without using any elaborate mathematical theories we can
examine the nature and deduce the equation of the curve of
error, which, though it is the foundation of modern mathematical
statistics, is only a reasonable summary of common experience.

It is not proposed here to go beyond the more elementary
and common applications of the law of error; the more
advanced treatment tends to deal more with theory and less
with practical applications, and is most suitably studied in
the original treatises scattered through the journals of various
learned societies. The present object is to endeavour to make
clear the groundwork of the subject, so that it will be the
easier for students to follow modern writers on statistics; the
mathematicians who are opening up new ground in this direction
naturally cannot stop in each article they write to establish
the elementary theorems which are already common property,
and so it is often not easy for readers, unfamiliar with these
elements, to find any satisfactory discussion or proof of the
preliminary formulae or theorems, since they are not contained
in any text-book devoted to the subject. It is this lack of
a preliminary text-book that it is wished to supply in this
and the following sections.

The treatment is not intended to be original, and is, it is 
hoped, not inconsistent with Professor Edgeworth's published
treatises,* since the greater part of the mathematics employed 
is gleaned from his essays, and the earlier authorities to whom 
he makes reference. The exact form of the proofs employed, 
and the particular ways in which the formulae are used, are not 
in all cases to be found elsewhere ; and any fault which may be 
found with the arguments or application of formulae must attach 
to the present writer. To avoid mere repetition of what is better 
said elsewhere, and not to cumber the ground with well-known 
elementary formulae, the reader is assumed to be acquainted 
with Dr Venn's Logic of Chance, and Whitworth's Choice and
Chance, or with the chapter on Probability to be found in
ordinary school algebras. It is hoped that the following pages, 
however, will be for the most part independent of proofs or 
formulae of which the explanation is not furnished. 

[Side-note: Need for application of theory arises from the definition.]

To the statistician of a generation ago, to the so-called
practical men of the present day, and perhaps to some political
economists, it would seem absurd and unnecessary to apply
these tedious arguments and complicated formulae to the study
of mere figures, which at first sight appear subject to the
ordinary rules of arithmetic; but it will be found as we proceed
that we are able by their use to solve problems and investigate
causal relations which, though apparently simple, must entirely
baffle direct attempts to obtain an easy solution. The necessity
of some application of the rules of probability becomes evident
from the very definition of the science of statistics.†

* See IdL and Phil. Magazine, passim; Statistical Journal, passim;
Report of British Association on Methods of Ascertaining Variations in the
Monetary Standard, 1888, and others.

† See supra, p. 7.

Statistics deals with great numbers, the numbers of the items
which compose some part of the economic or social body as
a whole. It does not deal with a single homogeneous mass
but with a complex body composed of multitudinous units
differing in form and action one from the other; and it is
with the complex not with the units that it is concerned.
Just as in the mechanics of rigid bodies it is necessary to
make some hypothesis as to the laws which hold their constituent
molecules in place, before any general problems relating
to their motion as a whole can be attacked, and in the kinetic 
theory of gases a generalized theorem of the motion of the 
separate molecules is employed, so in statistics we must obtain 
some generalizing principle as to the relation of unit to unit 
before we can study the phenomena manifested by the body. 
The economist and the politician, when investigating the 
effect of a given force, are as a rule concerned with its effect 
on the whole mass, not on the individuals in particular.* 
For illustration, we may take one of the numerical totals, relating 
to a nation, that remains nearly stationary year by year ; say 
the number of marriages yearly in a population of ten millions. 
It is on the total that we trace the effects of a change in 
our marriage laws. If we regarded only a single family, or a 
village or small town, we should not find any constancy ; the 
marriage rate would be changing continually with the personnel 
and age of the small community, and we could not trace with 
certainty the effect of any external cause. But add family to 
family, village to village, and district to district ; the individual 
peculiarities of the parts are rapidly lost in the total ; in a 
large community the same number are of marriageable age 
year by year, the same distribution by age and sex recurs 
continuously ; if undisturbed by external influences, the same 
marriage rate will be found over a long period. Each couple 
is influenced by many circumstances before finally deciding to 
marry; there are very many causes, each of limited effect, 
which influence the question in different localities, such as an 
exodus of young men from one district, commercial depres- 
sion in another, a new demand for labour in a third ; but 
when many districts are taken together these small disturbances 
counterbalance one another. To produce a change in the 
rate, the action of a cause is necessary which affects many 
districts in the same way. Here is to be found the assumption 
that underlies all statistical investigation, viz., that many inde- 
pendent disturbing causes of small individual effect neutralise 
one another in the mass.†

* See Mill's Logic, Book III., Chap. 23, and Book VI., Chap. 3.
† Compare the title of Lexis' treatise, viz., Zur Theorie der
Massenerscheinungen in der menschlichen Gesellschaft.

[Side-note: The common property of great numbers.]

It is a matter of common experience that great numbers
and averages drawn from them are nearly stationary. By
searching for the common properties contained in
these numbers, we shall find the clue to this constancy.
The following are among the numbers
which do not undergo rapid change: birth, marriage, and death
rates in districts of, say over a million inhabitants ; death rates 
according to age or disease over larger areas ; the numbers 
of the inhabitants of a great kingdom, even when subdivided 
by age and sex, the numbers of paupers, criminals, lunatics, 
afflicted ; the consumption of certain commodities, the total 
income, the average wage, and total imports and exports 
(though here the constancy is not so apparent). These are 
all totals of many small items, the existence of each of which is 
determined independently and apparently by chance. Another 
class is to be found in meteorological measurements, such as 
annual rainfall, mean temperature, and mean barometric height, 
where the average or total is again drawn from the combination 
of many small independent variations or contributions. An allied 
class is found in such physical measurements as average height 
and weight. 

It is not so easy to exemplify large numbers which are not
constant. The total revenue which varies with each change
in impost is an example. The number of a conscript army
changes with the law controlling it, the number of volunteers
with improved conditions of service, the area of the British
Empire with each territorial extension, the volume of trade
with commercial inflation, the death rate with an epidemic.
These are changes where one cause has influenced many
[...] at once in the same direction; but even here the
[...] constancy arising from the multitudinous small
independent causes is apparent.

This constancy, marvellous as it actually is, is generally
regarded as a matter of course; and it is not the regularity
but the occasional deflections which are the subject
of comment. For instance, the death rate in
[...] will hardly change, except regularly with the seasons,
[...] week through a series of years; and when an increase
of [...],000 occurs in some week, the newspapers write of
an epidemic. The mean annual rainfall will for a long
[...] near its average; then a decrease of 5 inches excites
[...] a permanent change of climate. It is because this
regularity has become a matter of common experience that so
little attention is generally given to it. A cursory inspection, 
however, of the records for a period of weeks or years of any of 
these numbers will show that the constancy is not absolute ; 
that each rate varies through a great or small percentage, and, 
except that the variation seldom passes certain limits, without 
any apparent law. Thence at once rises the question, how are 
we to determine whether a given deviation is due to some 
general cause, such as an epidemic, a change of climate, or a 
new law, or is natural to the phenomena ? 

[Side-note: Illustration by the binomial expansion.]

This question can only be answered by an appeal to the
laws of probability. To take a numerical instance: suppose we
are dealing with 1,000 men, each fifty years old, how many
should we expect to die in the year? Fall back on former
experience, and find what has been the average death rate
under similar circumstances; this rate gives the number to be
expected à priori; a great divergence appears from past
experience to be improbable, and the greater the divergence
the greater the improbability; an exact repetition of the
average itself appears to be improbable; the question is, what
divergence is to be expected? This is insoluble directly, but
we can frame a hypothesis which throws light on the problem.
Suppose the ascertained death rate to be 50 (per 1,000), and
further suppose that the chance of death for each individual is
$\tfrac{50}{1000} = \tfrac{1}{20}$. Then it is easily determined by the rules of
algebraic probability that the successive terms in the expansion
by the binomial theorem of $\left(\tfrac{19}{20} + \tfrac{1}{20}\right)^{1000}$
represent respectively the chances that exactly 0, 1, 2, 3 ...
of the persons die; $\left(\tfrac{19}{20}\right)^{1000}$ is the chance that none die,
$\left(\tfrac{19}{20}\right)^{999}\left(\tfrac{1}{20}\right)$ is the chance that one assigned individual only dies,
$1000 \times \left(\tfrac{19}{20}\right)^{999}\left(\tfrac{1}{20}\right)$ that only one unassigned individual dies,
and so on. The death of exactly 50 is more probable than any
other number, 49 very nearly as probable, 51 next. It is very
soon apparent when the successive terms are calculated, that
any great divergence from 50 is very improbable.
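
The successive terms of this expansion can be calculated directly. The following is a minimal sketch in Python, assuming only the figures of the text (1,000 men, chance of death 1/20); the names are chosen here for illustration, and the band of 36 to 64 deaths is an arbitrary choice made to show how little chance lies outside a moderate divergence.

```python
from math import comb

N_MEN = 1000                      # size of the group, as in the text
DENOMINATOR = 20 ** N_MEN         # common denominator of every term

# Integer numerators of the terms of (19/20 + 1/20)^1000:
# the term for exactly r deaths is C(1000, r) * 19^(1000 - r) / 20^1000.
numerators = [comb(N_MEN, r) * 19 ** (N_MEN - r) for r in range(N_MEN + 1)]


def chance_of_deaths(r):
    """Chance that exactly r of the 1,000 men die within the year."""
    return numerators[r] / DENOMINATOR


most_probable = max(range(N_MEN + 1), key=lambda r: numerators[r])
chance_36_to_64 = sum(numerators[36:65]) / DENOMINATOR

print("most probable number of deaths:", most_probable)
for r in (49, 50, 51):
    print(r, chance_of_deaths(r))
print("chance of 36 to 64 deaths:", chance_36_to_64)
```

Because the numerators are exact integers, the order of the terms (50 first, then 49, then 51) is established without any rounding.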

[Side-note: Process justified by its results.]

This conception, that all the men start with the same chance
of death, or, in a more developed form, that their chances of
death are grouped about an average $\tfrac{1}{20}$, satisfies the à priori
conditions of the problem, and clearly leads to
results which correspond roughly at any rate
with experience; but the justice of the conception cannot be
deduced à priori, for it is universally the case with any
hypothesis as to probability, that conformity with experience
is the only justification for the hypothesis. If it is true, we
should find that when the records of many such generations of
1,000 men were examined, the divergences from the average
were grouped in the way shown by the algebraic calculation.
The records for this particular examination are not extant;
but in the sequel some records will be given where experience
marches with theory, and references will be given to books where
others may be found; though it may be said at once that the
agreement is not perfect, and that there are indications that the
law is not so simple as that already suggested.

[Side-note: The meaning of a numerical chance.]

Consider the supposition that the chance of a death within a
year is $\tfrac{1}{20}$. When we say that the chance of an event is $\tfrac{1}{20}$, we
mean that if the circumstances connected with it
recurred again and again, the event would occur on
an average once for each twenty such recurrences.*
Thus if a die with six regular faces is thrown again and again,
the different faces tend to come uppermost with equal frequency.
As a matter of fact, each of the six would probably not be found
once in each six throws, nor exactly two of each in twelve throws;
but, in the long run, it is a matter of experience that the numbers
of times each of the six faces come uppermost tend to be equal.
Suppose an experiment, for the success of which the chance is
$\tfrac{1}{20}$, to be performed again and again. In 200 attempts from 8
to 12 successes may be obtained; in 2,000 the proportion of
successes to attempts will probably be nearer $\tfrac{1}{20}$, say 94 to 106
successes; in 2,000,000 yet nearer. Now suppose 1,000 experiments
to be made; as we have seen, exactly 50 successes are
not to be expected: but let 1,000 after 1,000 be tried; sometimes
more, sometimes less than 50 successes will be obtained;
and as the series continues the general average will tend nearer
and nearer to 50.
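
How the proportion of successes draws nearer to 1/20 as the attempts are multiplied may be illustrated by summing binomial terms. The sketch below is an illustration, not a calculation from the text: the helper names are invented, the terms are evaluated through logarithms (using `math.lgamma`) so that very large numbers of attempts can be handled, and the band "between 4 and 6 successes per 100 attempts" is an arbitrary choice made for comparison.

```python
from math import exp, lgamma, log


def log_term(n, r, p):
    """Logarithm of the binomial term C(n, r) * p^r * (1 - p)^(n - r)."""
    return (lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)
            + r * log(p) + (n - r) * log(1 - p))


def chance_between(n, lowest, highest, p=1 / 20):
    """Chance that the number of successes in n attempts lies in lowest..highest."""
    return sum(exp(log_term(n, r, p)) for r in range(lowest, highest + 1))


# The ranges quoted in the text.
p_200 = chance_between(200, 8, 12)
p_2000_text = chance_between(2000, 94, 106)

# The same proportional band (4 to 6 per cent.) for increasing numbers of attempts.
p_2000_band = chance_between(2000, 80, 120)
p_2000000_band = chance_between(2_000_000, 80_000, 120_000)

print("200 attempts, 8 to 12 successes:     ", round(p_200, 3))
print("2,000 attempts, 94 to 106 successes: ", round(p_2000_text, 3))
print("2,000 attempts, 80 to 120 successes: ", round(p_2000_band, 3))
print("2,000,000 attempts, 4 to 6 per cent.:", round(p_2000000_band, 6))
```

The chance of falling within the same proportional band rises steadily with the number of attempts, which is the behaviour the paragraph describes.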

* Logic of Chance, 3rd Edition, pp. 4, 5.

[Side-note: Relation to law of error.]

Still postponing the examination of the exact grouping of
such numbers about their average, let us examine further the
nature of the argument. Suppose we are given a
series of large numbers or rates, measuring similar
quantities year after year; we shall find, when they are grouped
according to their distance from their average, that the further
from the average the fewer are the instances. In most
cases we cannot work backwards to a number of individuals,
each of whom has an equal chance of furnishing an event, but
we can examine this grouping, notice how far the numbers are
from their average, and so on; in many cases we shall find that
these divergences conform to a definite law, the law of error,
which is obeyed by all great numbers coming from series of
experiments as just described. The point to notice specially
here is, that correspondence to this regular law of divergence is
natural, and it is for discrepancies that we need seek a reason.
It is improbable, it is impossible, that great numbers should
remain absolutely constant; from the nature of the case there
must be variation; in very many cases the natural variation, the
variation to be expected à priori, is that in accordance with the
law of error. This is so with those great numbers which are the 
sum of very many items, in favour of the existence of each of 
which there is a definite chance, or, more generally, the existence 
of each of which may be influenced by many independent causes 
each of limited effect. 

[Side-note: Cause and chance.]

A slight confusion may arise from the use of the words
cause and chance in this statement; this can be removed by
eliminating the word chance. We say a thing
happens by chance, when its occurrence is influenced
by many independent causes whose separate effects
we cannot trace, as when we draw a card from a thoroughly
shuffled pack. Now if we consider a man's death from
the point of view of an insurance office, we regard the
man as of normal health and constitution, and liable to all
the latent diseases, the accidents, and the epidemics, from
which experience shows men suffer; we cannot trace the incipient
development of a disease, nor foretell the chain of events
which lead to an accident. We then speak simply of the
chance of death within a certain period, and say experience
shows it to be (e.g.) $\tfrac{1}{20}$, and, regarding the peculiarities of a
particular man as unknown, we say that his chance of death is
$\tfrac{1}{20}$. Generalizing, any group of men, each of the given age and
in the given circumstances, is composed of individuals for each
of whom the chance of death is $\tfrac{1}{20}$. Now, go behind the idea of
chance to that of cause. Each death is the result of some
particular event, or, to speak more correctly, is due to the action
of a complex of many causes; all these untraceable causes produce
on an average one death among 20 living; the statement
of the numerical chance is merely the summary of these effects.
To say, then, that the number of deaths to be expected among
1,000 is the same as the number of successes to be expected in
1,000 attempts, the chance of success in each of which is $\tfrac{1}{20}$, is
not inconsistent with saying that the number of deaths is determined
by the action of a multitude of causes none of which by
itself produces a great effect. In either case the laws of great
numbers will be found to apply. The use of the intermediate
numerical chance only facilitates calculation.

[Side-note: Effect of a predominant cause.]

Now suppose that a new cause is suddenly introduced, or the
action of one of the causes is intensified (say, by an epidemic),
and at once the whole scheme of calculation is
thrown out, and we get a result which does not
correspond to the probability calculation; it is this
non-correspondence which indicates the existence of a disturbing
cause.

Since the distribution in accordance with the curve of error is
the result which may be expected à priori, whenever we are dealing
with numbers generated in this way, it is clearly necessary to
study this distribution before we can base any arguments on the 
variation of great numbers. When we have established the result 
which the independent action of a very great number of individually 
unimportant causes can produce, then, and not till then, we are in 
a position to consider the effect of a predominant cause. We 
may even be able to deduce the existence of such a cause, for if 
we find by examination that a divergence of more than 3 per 
cent, from an average is improbable, and in a particular case we 
have a divergence of 30 per cent., we are either in the presence 
of a very improbable event, or some external predominant cause 
has influenced our numbers.

Section II. The Equation of the Curve of Error.

[Side-note: This section can be omitted, but not without loss.]

In this section it is proposed to develop the algebraic equation
and properties of the curve of error, bringing them into
relation with the other sections. It will be not
impossible for non-mathematical readers to follow
the great part of the argument of this branch of statistics without
working through the mathematical proofs of the formulae;
and the book is so arranged that this section can be omitted.
Other readers may turn this chapter through, looking at the
large type only, and notice the main lines of the
argument. For any thorough student of statistics,
however, the mathematical proofs, which are so simplified in this
chapter as not to involve the integral calculus at all, and the
differential calculus only for two small points, are essential. In
this section an acquaintance with algebra up to and including
the exponential and logarithmic series is assumed. Starting
from that point, the main formulae relating to the curve of error
are deduced.

Elementary Theorems in Probability. 



Definition. If an event can happen in $m$ ways, and fail in $n - m$
ways, and all these ways are equally likely to occur, then $\frac{m}{n}$ is the
probability of its occurrence.

Let $\frac{m}{n} = p$, and $\frac{n-m}{n} = q$: then $p + q = 1$. $q$ is the chance that
the event will not occur. The odds in favour are $p$ to $q$, those against are
$q$ to $p$.

E.g., the chance that a card, drawn at random from a full pack, is a
spade $= \frac{13}{52} = \frac{1}{4}$.

Theorem. If $p_1$, $p_2$ are the chances of two independent events, then
$p_1 \times p_2$ is the chance that both will occur.

Suppose that $p_1 = \frac{m_1}{n_1}$, $p_2 = \frac{m_2}{n_2}$.

The first event may be expected $m_1$ times in $n_1$ trials, or $m_1 n_2$ times in
$n_1 n_2$ trials.

The second event may be expected $m_2$ times in $n_2$ trials, or $m_1 m_2$
times in $m_1 n_2$ trials.

Hence the second event will occur at same time as the first $m_1 m_2$ times
in $n_1 n_2$ trials; that is, the chance of the double event is
$\frac{m_1 m_2}{n_1 n_2} = p_1 p_2$.
Examples. Independent events. The chance that two sixes will be
thrown with a pair of dice $= \frac{1}{6} \times \frac{1}{6} = \frac{1}{36}$.

Dependent events. The chance that three cards taken in succession
from the same pack shall prove to be ace, king, and queen of the same suit
in any order is $\frac{12}{52} \times \frac{2}{51} \times \frac{1}{50} = \frac{1}{5525}$: for the chance that the first card
drawn is an ace, king, or queen is $\frac{12}{52}$; supposing it to be queen,
the chance that ace or king of same suit follows is $\frac{2}{51}$; and the chance
that the third draw gives the remaining card is $\frac{1}{50}$.

The chance that 13 cards, taken at random from a complete pack, will
contain 8 spades and 5 clubs is $\dfrac{^{13}C_8 \times {}^{13}C_5}{^{52}C_{13}}$;* for 8 spades can be chosen
in $^{13}C_8$ ways, 5 clubs in $^{13}C_5$ ways, and the hand may contain any such
group of spades with any such group of clubs; hence the numerator
given corresponds to $m$ of the definition given above; also there are
$^{52}C_{13}$ equally likely possible hands of 13 cards, so that the denominator
given corresponds to $n$.
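
The three examples can be verified with exact fractions. This is a small sketch in Python; the variable names are invented for the purpose, and the second example is cross-checked by a direct count of favourable ordered draws (4 suits, each with 6 orders of ace, king, and queen) out of 52 × 51 × 50.

```python
from fractions import Fraction
from math import comb

# Independent events: two sixes with a pair of dice.
two_sixes = Fraction(1, 6) * Fraction(1, 6)

# Dependent events: ace, king, and queen of one suit, in any order, in three draws.
ace_king_queen = Fraction(12, 52) * Fraction(2, 51) * Fraction(1, 50)
ace_king_queen_by_counting = Fraction(4 * 6, 52 * 51 * 50)

# A hand of 13 cards containing exactly 8 spades and 5 clubs.
eight_spades_five_clubs = Fraction(comb(13, 8) * comb(13, 5), comb(52, 13))

print(two_sixes)                       # 1/36
print(ace_king_queen)                  # 1/5525
print(float(eight_spades_five_clubs))  # a very small chance
```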

Theorem. If $n$ coins are placed at random on a table, the chance
that $r$ will show heads and the rest tails is $\dfrac{^nC_r}{2^n}$.

For suppose there are $n$ places to be filled each with a coin:
The first may show head or tail, two ways.
The second may show head or tail, two ways.
The first two places may therefore be filled in $2 \times 2$ ways.
The $n$ places may similarly be filled in $2^n$ ways.
Now $r$ of these places can be chosen in $^nC_r$ ways; and to each such
selection corresponds one arrangement in which these $r$ places are filled
with heads and the rest with tails (and many other arrangements not
giving this result).

Hence out of $2^n$ possible arrangements, $^nC_r$ give the result.

* $^nC_r$ or $_nC_r$ is written for $\dfrac{n!}{(n-r)!\,r!}$, the number of combinations of $n$
things $r$ at a time; and $^nP_r$ or $_nP_r$ is written for $\dfrac{n!}{(n-r)!}$, the number of
permutations of $n$ things $r$ at a time.

Aliter. Consider the product $(h_1 + t_1)(h_2 + t_2) \ldots (h_n + t_n)$.
Any term of this product, e.g., $h_1 t_2 h_3 t_4 h_5 \ldots h_{n-1} t_n$, corresponds
to one arrangement of the $n$ coins.

The number of arrangements containing $r$ heads and $n - r$ tails is
the same as the number of terms containing $r$ $h$'s and $n - r$ $t$'s.

This number is the same as the coefficient of $h^r t^{n-r}$ in the expansion
by the binomial theorem of $(h + t)^n$, which is obtained from the product
above by writing $h$ for $h_1$, $h_2$, &c., and $t$ for $t_1$, $t_2$, &c.

$$(h + t)^n = h^n + {}^nC_1 h^{n-1} t + \ldots + {}^nC_r h^{n-r} t^r + \ldots + t^n.$$

Hence the number of arrangements producing the required result
is $^nC_r$. The total number of possible arrangements is the sum of the
coefficients in this expansion; this is found by putting $h = t = 1$.

$$(1 + 1)^n = 1 + {}^nC_1 + \ldots + {}^nC_r + \ldots + 1.$$

Hence total number $= 2^n$.

Notice that $^nC_r = {}^nC_{n-r}$.
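
For a small number of coins the theorem can be confirmed by writing out every arrangement. The sketch below is an illustration only: the function name is invented here and six coins are chosen arbitrarily so that all 64 arrangements can be listed.

```python
from itertools import product
from math import comb


def arrangements_by_heads(n):
    """Count, for each r, the arrangements of n coins showing exactly r heads."""
    counts = [0] * (n + 1)
    for arrangement in product("HT", repeat=n):   # every one of the 2^n arrangements
        counts[arrangement.count("H")] += 1
    return counts


n = 6
counts = arrangements_by_heads(n)
print(counts)                                     # coefficients of (h + t)^6
print([comb(n, r) for r in range(n + 1)])         # the same numbers from the formula
print(sum(counts), 2 ** n)                        # total number of arrangements
```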

Example. The coefficients in the expansion of $(1 + 1)^{32}$ are as
follows:

Coefficient.                                     Corresponding chance.*

$^{32}C_0 = {}^{32}C_{32}$    =               1        .0000000002
$^{32}C_1 = {}^{32}C_{31}$    =              32        .0000000074
$^{32}C_2 = {}^{32}C_{30}$    =             496        .0000001155
$^{32}C_3 = {}^{32}C_{29}$    =           4,960        .000001155
$^{32}C_4 = {}^{32}C_{28}$    =          35,960        .000008375
$^{32}C_5 = {}^{32}C_{27}$    =         201,376        .00004688
$^{32}C_6 = {}^{32}C_{26}$    =         906,192        .0002110
$^{32}C_7 = {}^{32}C_{25}$    =       3,365,856        .0007837
$^{32}C_8 = {}^{32}C_{24}$    =      10,518,300        .002449
$^{32}C_9 = {}^{32}C_{23}$    =      28,048,800        .006530
$^{32}C_{10} = {}^{32}C_{22}$  =      64,512,240        .01502
$^{32}C_{11} = {}^{32}C_{21}$  =     129,024,480        .03004
$^{32}C_{12} = {}^{32}C_{20}$  =     225,792,840        .05257
$^{32}C_{13} = {}^{32}C_{19}$  =     347,373,600        .08088
$^{32}C_{14} = {}^{32}C_{18}$  =     471,435,600        .1097
$^{32}C_{15} = {}^{32}C_{17}$  =     565,722,720        .1317
$^{32}C_{16}$              =     601,080,390        .1400

$2^{32} = 4,294,967,296.$

* Obtained by dividing each term by $2^{32}$.

The table just given shows that when 32 coins are placed on a
table at random, the chance that 16 heads and 16 tails shall appear
is .14, while it is more likely that either 15 heads (and 17 tails) or
15 tails (and 17 heads) will be found, the united chances for these
being .2634. The chance that the divergence from equal division shall
not exceed 2 (i.e., that there shall be at least 14 of each) is .1400 + 2 ×
(.1317 + .1097) = .6228; the chance that there shall be as many as 27
of one kind is only .00011, i.e., 1 in 9,000.
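
The chances quoted for 32 coins can be recomputed from the coefficients. The following Python sketch uses names invented here; the results differ from the printed figures only in the last decimal place, since the text adds terms already rounded to four places.

```python
from math import comb

N_COINS = 32
TOTAL = 2 ** N_COINS                      # number of equally likely arrangements

# chance[r] is the chance that exactly r of the 32 coins show heads.
chance = [comb(N_COINS, r) / TOTAL for r in range(N_COINS + 1)]

exactly_16_heads = chance[16]
fifteen_of_one_kind = chance[15] + chance[17]
at_least_14_of_each = sum(chance[14:19])          # 14, 15, 16, 17 or 18 heads
as_many_as_27_of_one_kind = 2 * sum(chance[27:])  # 27 or more heads, or 27 or more tails

print(round(exactly_16_heads, 5))
print(round(fifteen_of_one_kind, 4))
print(round(at_least_14_of_each, 4))
print(as_many_as_27_of_one_kind, "or about 1 in", round(1 / as_many_as_27_of_one_kind))
```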

The Binomial Expansion. 

The following table from Quetelet's Lettres sur la Théorie des
Probabilités, p. 375, shows a similar calculation when the index
is 999 instead of 32. For instance, $^{999}C_{500}\left(\tfrac{1}{2}\right)^{999} = .025225$, the
first quantity in Column 3.

As the index of the binomial expansion is continually 
increased, the grouping of the figures takes a definite shape. 
The curve so obtained when the index is indefinitely great is 
called the curve of error. 
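
The leading entries of Quetelet's table for the index 999 can be recomputed in the same way. This sketch is an illustration with invented names; exact fractions are used so that the large binomial coefficients cause no loss of accuracy, and the running total corresponds to Column 4 of the table (the sum of the probabilities on one side, starting from the most probable group).

```python
from fractions import Fraction
from math import comb

INDEX = 999
TOTAL = 2 ** INDEX


def chance_of_white(w):
    """Chance that exactly w of the 999 balls drawn are white."""
    return Fraction(comb(INDEX, w), TOTAL)


# Rank 1 is 499 white and 500 black; rank 2 is 498 white and 501 black; and so on.
probability = {rank: chance_of_white(500 - rank) for rank in range(1, 20)}

running_total = {}
total = Fraction(0)
for rank in range(1, 20):
    total += probability[rank]
    running_total[rank] = total

for rank in (1, 2, 15, 19):
    print(rank, round(float(probability[rank]), 6), round(float(running_total[rank]), 6))
```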

In the diagram at the end of the book, the line $A_0 F_1 F_2$
represents the first half of the coefficients of $(a+b)^4$; the line
$A_0 G_1 G_2 G_3 G_4 G_5$ represents the coefficients of $(a+b)^{10}$; the
line $A_0 H_1 H_2 H_3 \ldots$ represents the coefficients of $(a+b)^{[\ldots]}$,
and $A_0 A_1 A_2 A_3 A_4$ is the curve of error. To fit these jagged
lines to the curve of error, the maximum coefficient is represented
in each case by the line $O A_0$, and ordinates are drawn
at equal intervals proportional to the other coefficients; the
tops of these ordinates are then joined by straight lines. The
interval between successive ordinates is decided by the consideration
that the area included between any chosen ordinate,
say $H_2 P_2$, the base $P_2 O$, the maximum ordinate $O A_0$,
and the line $H_2 H_1 A_0$ shall be the same fraction of the whole
area $A_0 O \ldots H \ldots A_0$, as the part of the area of the
limiting curve of error cut off by the same ordinate is of the
whole area bounded by $O A_0$, $O X$ and the curve.

The algebraic determination of this limit is given on pp.
275 seq.

Suppose now that one ball is taken out of each of $n$ bags, each containing
$m_1$ white and $m_2$ black balls, the chance that $r$ will be white
and $n - r$ black is

$$^nC_r\, p^r q^{n-r}, \quad \text{where } p = \frac{m_1}{m},\ q = \frac{m_2}{m} \text{ and } m = m_1 + m_2.$$

For $r$ bags may be selected in $^nC_r$ ways. The chance that each of



Scale of Precision.

999 balls are drawn from a bag containing equally great numbers
of black and white balls.

Column 1 gives number of each colour.
Column 2 gives rank of deviation from equality.
Column 3 gives probability that balls will be drawn in proportion given in
Column 2.
Column 4 gives probability that deviation from equality will not be greater
than that of given rank.

Column 3 is headed "Scale of Probability: probability that such a group
will be drawn," and Column 4 "Scale of Precision: sum of probabilities
starting from most probable."

      1.          2.       3.           4.       |       1.          2.       3.           4.
  Groups of      Rank.  Scale of     Scale of    |   Groups of      Rank.  Scale of     Scale of
White.  Black.          Probability. Precision.  | White.  Black.          Probability. Precision.

 499     500       1    .025225      .025225     |  459     540      41    .0009458     .495278
 498     501       2    .025124      .050349     |  458     541      42    .0008024     .496081
 497     502       3    .024924      .075273     |  457     542      43    .0006781     .496759
 496     503       4    .024627      .099900     |  456     543      44    .0005707     .497329
 495     504       5    .024236      .124136     |  455     544      45    .0004784     .497808
 494     505       6    .023756      .147892     |  454     545      46    .0003994     .498207
 493     506       7    .023193      .171085     |  453     546      47    .0003321     .498539
 492     507       8    .022552      .193637     |  452     547      48    .0002750     .498814
 491     508       9    .021842      .215479     |  451     548      49    .0002268     .499041
 490     509      10    .021069      .236548     |  450     549      50    .0001863     .499227
 489     510      11    .020243      .256791     |  449     550      51    .0001525     .499380
 488     511      12    .019372      .276163     |  448     551      52    .0001242     .499504
 487     512      13    .018464      .294627     |  447     552      53    .0001008     .499605
 486     513      14    .017528      .312155     |  446     553      54    .0000815     .499686
 485     514      15    .016573      .328728     |  445     554      55    .0000656     .499752
 484     515      16    .015608      .344335     |  444     555      56    .0000526     .499804
 483     516      17    .014640      .358975     |  443     556      57    .0000421     .499847
 482     517      18    .013677      .372652     |  442     557      58    .0000334     .499880
 481     518      19    .012726      .385378     |


441 


558 


59 


.0000265 


.499906 


480 


519 


20 


.011794 


.397172 


440 


559 


60 


.0000209 


•499927 


479 


520 


21 


.010887 


.408060 


439 


560 


61 


.0000164 


.499944 


478 


521 


22 


.010008 


.418070 


438 


561 


62 


.0000128 


.499957 


477 


522 


23 


.009166 


.427236 


437 


562 


63 


.0000100 


.499967 


476 


523 


24 


.008360 


•435595 


436 


563 


64 


.0000077 


.499974 


475 


524 


25 


•007594 


.443189 


435 


564 


65 


.0000060 


.499980 


474 


525 


26 


.006871 


.450060 


434 


565 


66 


.0000046 


.499985 


473 


526 


27 


.006191 


.456251 
.401809 


433 


566 


67 


.0000035 


.499988 


472 


527 


28 


•005557 


432 


567 


68 


.0000027 


.4999912 


471 


528 


29 


.004968 


.466776 


431 


568 


69 


.0000021 


.4999933 


470 


529 


30 


.004423 


.47"99 


430 


569 


70 


.0000016 


.4999948 


469 


530 ' 


31 


.003922 


.475122 


429 


570 


71 


.0000012 


.4999960 


468 


531 


32 


.003464 


.478586 


428 


571 


72 


.0000009 


.4999969 


467 


532 


33 


.003047 


.481633 


427 


572 


73 


.0000007 


.4999976 


466 


533 


34 


.002670 


.484304 


426 


573 


74 


,.0000005 


.4999981 


465 


534 


35 


.002330 


.486634 


42s 


574 


75 


.0000004 


.4999984 


464 


535 


36 


.002025 


.488659 


424 


575 


76 


.0000003 


.4999987 


463 


536 


^l 


.001753 


.490412 


423 


576 


n 


.0000002 


•4999989 


462 


537 


38 


.001512 


.491924 


422 


577 


78 


.00000014 


.4999990 


461 


538 


39 


.001298 


.493222 


421 


578 


79 


.00000011 


.4999991 


460 


539 


40 


.001110 


•494332 


420 


579 


80 


.00000004 


.4999992 



By means of this scale the binomial 999, practically equivalent to curve of error, 
can be fitted to and compared with any series of observations. 
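
The scale can be recomputed directly from the binomial coefficients. The following is a minimal sketch, assuming Python's exact integer arithmetic; the function name `scale_of_precision` and its parameters are illustrative and not part of the original. It takes rank 1 to be the split 499 white to 500 black and lets each further rank remove one more white ball.

```python
from math import comb


def scale_of_precision(n=999, max_rank=80):
    """Rows (white, black, rank, probability, running sum) for n draws at even odds.

    Assumes n is odd, so that the two most probable splits are equally likely
    and rank 1 is the split with one white ball fewer than black.
    """
    total = 2 ** n
    rows = []
    running = 0.0
    for rank in range(1, max_rank + 1):
        white = (n + 1) // 2 - rank
        black = n - white
        # Exact integer numerator and denominator, rounded once to a float.
        probability = comb(n, white) / total
        running += probability
        rows.append((white, black, rank, probability, running))
    return rows


rows = scale_of_precision()
for white, black, rank, probability, running in rows[:3]:
    print(white, black, rank, f"{probability:.6f}", f"{running:.6f}")
```

Because the printed columns were rounded separately, a recomputed running sum may differ from the printed one by a unit in the last place.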

these will yield a white ball is $p \times p \times p \times \dots$ to r factors, i.e., $p^{r}$; the 
chance that each of the other n - r bags will yield a black ball is $q^{n-r}$; 
hence the required chance is as stated. 

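Aliter. 

Call the white balls in the first bag ${}_1w_1, {}_1w_2, \dots$ 
and the black balls in the first bag ${}_1b_1, {}_1b_2, \dots$; 
the white balls in the second bag ${}_2w_1, {}_2w_2, \dots$ 
and the black balls in the second bag ${}_2b_1, {}_2b_2, \dots$; 
and so on; then all possible arrangements are represented by the 
individual terms of the product 

$$({}_1w_1 + {}_1w_2 + \dots + {}_1w_{m_1} + {}_1b_1 + {}_1b_2 + \dots + {}_1b_{m_2}) \times ({}_2w_1 + {}_2w_2 + \dots + {}_2w_{m_1} + {}_2b_1 + {}_2b_2 + \dots + {}_2b_{m_2}) \times \dots \text{ to } n \text{ factors};$$

e.g., a term formed by taking one letter, w or b, from each of the n factors represents one group. A w 
will occur r times and a b the remaining n - r times as often as the 
term $w^{r} b^{n-r}$ occurs in the binomial expansion of $(m_1 w + m_2 b)^{n}$ [where 
all the w's are put as w, and all the b's as b]. The coefficient of $w^{r} b^{n-r}$ 
in this expansion is ${}^{n}C_{r}\, m_1^{\,r}\, m_2^{\,n-r}$. Total number of possible arrangements 
is $m^{n}$. Hence the required chance is 

$$\frac{{}^{n}C_{r}\, m_1^{\,r}\, m_2^{\,n-r}}{m^{n}} = {}^{n}C_{r} \left(\frac{m_1}{m}\right)^{r} \left(\frac{m_2}{m}\right)^{n-r}.$$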

E.g., to find the probable number of sixes in n throws of dice. 
Here $m = 6$, $m_1 = 1$, $m_2 = 5$, $p = \frac{1}{6}$, $q = \frac{5}{6}$. 

Probability of r sixes $= {}^{n}C_{r} \left(\frac{1}{6}\right)^{r} \left(\frac{5}{6}\right)^{n-r}$. 



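Suppose n = 12. Total number of possible arrangements is $6^{12}$ = 
2,176,782,336. 

  Number of sixes.     Number of arrangements in which they occur.
        12             ${}^{12}C_{12} \cdot 1^{12} \cdot 5^{0}$   =               1
        11             ${}^{12}C_{11} \cdot 1^{11} \cdot 5^{1}$   =              60
        10             ${}^{12}C_{10} \cdot 1^{10} \cdot 5^{2}$   =           1,650
         9             ${}^{12}C_{9} \cdot 1^{9} \cdot 5^{3}$     =          27,500
         8             ${}^{12}C_{8} \cdot 1^{8} \cdot 5^{4}$     =         309,375
         7             ${}^{12}C_{7} \cdot 1^{7} \cdot 5^{5}$     =       2,475,000
         6             ${}^{12}C_{6} \cdot 1^{6} \cdot 5^{6}$     =      14,437,500
         5             ${}^{12}C_{5} \cdot 1^{5} \cdot 5^{7}$     =      61,875,000
         4             ${}^{12}C_{4} \cdot 1^{4} \cdot 5^{8}$     =     193,359,375
         3             ${}^{12}C_{3} \cdot 1^{3} \cdot 5^{9}$     =     429,687,500
         2             ${}^{12}C_{2} \cdot 1^{2} \cdot 5^{10}$    =     644,531,250
         1             ${}^{12}C_{1} \cdot 1^{1} \cdot 5^{11}$    =     585,937,500
         0             ${}^{12}C_{0} \cdot 1^{0} \cdot 5^{12}$    =     244,140,625
                                                       Total   =   2,176,782,336

The most probable number of sixes is 2, of which the chance is 
about $\frac{3}{10}$. In four-fifths of the trials there will probably be 1, 2, or 3 
sixes. 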
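
The dice figures can be checked by counting arrangements directly. This is a small sketch under the same assumption as the text (12 fair dice); the names `ways_of_sixes`, `most_probable` and `share_1_to_3` are illustrative only.

```python
from math import comb


def ways_of_sixes(n=12):
    """Number of the 6**n equally likely arrangements showing exactly r sixes."""
    return {r: comb(n, r) * 5 ** (n - r) for r in range(n + 1)}


ways = ways_of_sixes(12)
total = 6 ** 12

# The counts must account for every possible arrangement.
assert sum(ways.values()) == total

most_probable = max(ways, key=ways.get)
share_1_to_3 = (ways[1] + ways[2] + ways[3]) / total

print(most_probable, ways[most_probable] / total, share_1_to_3)
```

On this count the chance of 1, 2, or 3 sixes comes to about 0.76, which is the exact figure behind the round fraction quoted in the text.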

E.g., to investigate whether drunkenness occurs chiefly on night of 
pay-day (suppose Saturday). If the maximum number of convictions in a 
week is on Saturday in 10 weeks out of 12 selected at random, we have 
an event whose probability is only $\frac{1650}{2176782336} = \frac{1}{1300000}$ (about), 
if the position of the day in the week had nothing to do with it. 

It must be noticed that the probability that the event will occur 10 times 
out of 12 on any one and the same week-day is much greater. 

Probability that the event will occur at least 10 times is 
$\frac{1650 + 60 + 1}{2176782336} = \frac{1711}{2176782336}$, which is much the same as before. 

Similarly any questions depending on the occurrence of an event in 
the same month may be worked out. In this case $m_1 = 1$, $m_2 = 11$, 
n = number of years investigated. 

If a bag contains m balls of different colours, $m_1$ white, $m_2$ red, 
$m_3$ green, &c., the probability that $r_1$ white, $r_2$ red, $r_3$ green, &c., will 
occur is the term containing $p_1^{\,r_1} p_2^{\,r_2} p_3^{\,r_3} \dots$ in the expansion of $(p_1 + p_2 + p_3 + \dots)^{n}$ 
by the multinomial theorem, where $p_1 = \frac{m_1}{m}$, $p_2 = \frac{m_2}{m}$, &c. 
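
The multinomial term can be evaluated numerically. The sketch below assumes the balls are drawn with replacement (or one from each of n like bags), so that the chances stay fixed from draw to draw; the function name and argument layout are illustrative.

```python
from math import factorial


def multinomial_probability(counts, probs):
    """Chance of counts[i] balls of colour i in sum(counts) draws.

    probs[i] is the chance of colour i at a single draw.
    """
    n = sum(counts)
    coefficient = factorial(n)
    for r in counts:
        coefficient //= factorial(r)  # stays an exact integer at every step
    chance = float(coefficient)
    for r, p in zip(counts, probs):
        chance *= p ** r
    return chance


# Two colours reduce to the binomial case: 2 sixes in 12 throws.
print(multinomial_probability([2, 10], [1 / 6, 5 / 6]))
```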

Notice that the probability of an event, if it was a chance 
occurrence, is not the same as the probability that the event was 
a chance occurrence. 

If 13 trumps appeared in the same hand, we could not say 
that the chances were (${}^{51}C_{12}$ =) 158,753,389,900 to 1 that the 
hand was "faked," but we should have strong though incommensurable 
evidence on the point. 

Deduction of Equation of Curve of Error. 

We can now proceed to the determination of the equation of 
the curve of error. 

The chance of r successes is greatest when r is the greatest integer 
in pn ; this is found by the ordinary method of determining the maxi- 
mum term in a binomial expansion. 

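Let P be this maximum value 
$= {}^{n}C_{pn}\, p^{pn} q^{qn} = \frac{n!}{(pn)!\,(qn)!}\, p^{pn} q^{qn}$, making the supposition 
for brevity that pn is integral, which will not affect the proof. 

Let $P_x$ be the chance of pn + x white balls. 

Then 

$$P_x = P \times \left(\frac{p}{q}\right)^{x} \frac{qn\,(qn-1) \dots (qn-x+1)}{(pn+1)(pn+2) \dots (pn+x)} = P \times \frac{\left(1-\frac{1}{qn}\right)\left(1-\frac{2}{qn}\right) \dots \left(1-\frac{x-1}{qn}\right)}{\left(1+\frac{1}{pn}\right)\left(1+\frac{2}{pn}\right) \dots \left(1+\frac{x}{pn}\right)}.$$

Taking logarithms of both sides, 

$$\log P_x = \log P + \log\left(1-\frac{1}{qn}\right) + \log\left(1-\frac{2}{qn}\right) + \dots + \log\left(1-\frac{x-1}{qn}\right) - \log\left(1+\frac{1}{pn}\right) - \log\left(1+\frac{2}{pn}\right) - \dots - \log\left(1+\frac{x}{pn}\right)$$

$$= \log P - \left(\frac{1}{qn} + \frac{1}{2}\frac{1}{(qn)^2} + \dots\right) - \left(\frac{2}{qn} + \frac{1}{2}\left(\frac{2}{qn}\right)^{2} + \dots\right) - \dots - \left(\frac{x-1}{qn} + \frac{1}{2}\left(\frac{x-1}{qn}\right)^{2} + \dots\right)$$
$$\qquad - \left(\frac{1}{pn} - \frac{1}{2}\frac{1}{(pn)^2} + \dots\right) - \left(\frac{2}{pn} - \frac{1}{2}\left(\frac{2}{pn}\right)^{2} + \dots\right) - \dots - \left(\frac{x}{pn} - \frac{1}{2}\left(\frac{x}{pn}\right)^{2} + \dots\right)$$

$$= \log P - \frac{1+2+\dots+(x-1)}{qn} - \frac{1^2+2^2+\dots+(x-1)^2}{2q^2n^2} - \dots - \frac{1+2+\dots+x}{pn} + \frac{1^2+2^2+\dots+x^2}{2p^2n^2} - \dots$$

$$= \log P - \frac{x(x-1)}{2qn} - \frac{x(x+1)}{2pn} - \frac{(x-1)\,x\,(2x-1)}{12q^2n^2} + \frac{x(x+1)(2x+1)}{12p^2n^2} - \dots$$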

Now when n, the number of bags, is very great, pn and 
qn are also very great if neither p nor q is very small; x, the divergence 
from pn, ranges from 0 to qn on the positive side of the maximum, 
and 0 to -pn on the negative side. The chance of so great a divergence 
as -pn or qn is very small. The chance of a small divergence, 
such as x = 1, 2, 3 ... is very nearly equal to the maximum chance P. 
For instance, if x = 3, 

$$P_3 = P \times \left(\frac{p}{q}\right)^{3} \times \frac{qn\,(qn-1)(qn-2)}{(pn+1)(pn+2)(pn+3)}$$

$$= P \left(1-\frac{1}{qn}\right)\left(1-\frac{2}{qn}\right)\left(1+\frac{1}{pn}\right)^{-1}\left(1+\frac{2}{pn}\right)^{-1}\left(1+\frac{3}{pn}\right)^{-1}$$

$$= \text{(expanding each term) } P \times \left\{1 - \frac{3}{nq} - \frac{6}{np} + \text{terms involving } \frac{1}{n^2}\right\}.$$



Hence, in order that $P_x$ may have at once a finite value and one 
with a finite divergence from P, x must be very great compared with 
unity, but small compared with n. 

Re-write the above equation for $P_x$, neglecting terms in $\left(\frac{x}{n}\right)^{2}$: 

$$\log P_x = \log P - \frac{x^2}{2n}\left(\frac{1}{p}+\frac{1}{q}\right) + \frac{x}{2n}\left(\frac{1}{q}-\frac{1}{p}\right) - \frac{x^3}{n^2}(\ \dots\ )\ \text{\&c.}$$

If $\frac{x^2}{n}$ were negligible, $\log P_x = \log P$. 

If $\frac{x}{n}$ were finite, and therefore x infinite, both $\frac{x^2}{n}$ and $\frac{x^3}{n^2}$ would be 
infinite. 

That part of the resulting curve, which shows finite curvature, is 
found by assuming that x and n are infinite, but $\frac{x^2}{n}$ finite. 

[The general argument is similar to that used for constructing the 
finite part of a parabola on a finite scale, for there $\frac{y^2}{x}$ is finite.] 

On this hypothesis $\frac{x}{n} = \left(\frac{x^2}{n}\right) \cdot \frac{1}{x}$, $\frac{x^3}{n^2} = \left(\frac{x^2}{n}\right) \cdot \frac{x}{n}$, and these and 
presumably further terms are infinitesimal. The equation of the finite 
part of the curve is therefore 

$$\log P_x = \log P - \frac{x^2}{2n}\left(\frac{1}{p}+\frac{1}{q}\right)$$

or $P_x = P\, e^{-\frac{x^2}{2pqn}}$, since p + q = 1. 

Writing y for $P_x$, $y = P\, e^{-\frac{x^2}{2pqn}}$. 

The curve is horizontal near the maximum ordinate, for $P_x = P$ when 
x is small, and extends to infinity in both directions, the axis of x being 
an asymptote, for when $x = \pm\infty$, $P_x$ is zero. 

When $\frac{x}{n}$ is negligible, the curve is very approximately symmetrical; 
this symmetry may be shown to extend over the finite part of the 
curve, when n is large.* The annexed diagram illustrates the extent of 
the asymmetry for a small value of n. 
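
The closeness of the limiting curve to the binomial, and the asymmetry that remains for moderate n, can be examined numerically. This is a rough sketch, assuming pn is integral as in the proof; the function names are illustrative.

```python
from math import comb, exp


def exact_ratio(n, p, x):
    """P_x / P for the binomial: chance of pn + x successes over that of pn."""
    q = 1.0 - p
    centre = round(p * n)  # assumes pn is (very nearly) an integer
    return comb(n, centre + x) / comb(n, centre) * (p / q) ** x


def curve_ratio(n, p, x):
    """Limiting value exp(-x^2 / (2pqn)) from the equation of the curve."""
    return exp(-x * x / (2.0 * p * (1.0 - p) * n))


# Symmetrical case: 1,000 bags with equal numbers of white and black balls.
print(exact_ratio(1000, 0.5, 20), curve_ratio(1000, 0.5, 20))

# Unsymmetrical case: the chance of a six, 600 throws, divergences of +5 and -5.
print(exact_ratio(600, 1 / 6, 5), exact_ratio(600, 1 / 6, -5), curve_ratio(600, 1 / 6, 5))
```

In the second case the two exact ratios fall on either side of the value given by the curve, which is the asymmetry produced by the term in x/n.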



* By considering the values of the various quantities in relation to the 
table on p. 281. 

Relation of Curve of Error to Statistics. 

The following example shows how Quetelet fitted his figures 
to given observations:* 

Chest Measurements of Scotch Soldiers. 

Column 1: Chest measurement, inches. 
Column 2: No. of men. 
Column 3: Proportional nos. 
Column 4: No. between given measurement and mean. 
Column 5: Rank in scale of precision. 
Column 6: Calculated rank of measurement. 
Column 7: Precision of calculated rank. 
Column 8: Calculated no. of observations to each measurement. 
Column 9: Differences between columns 3 and 8. 

  1.      2.       3.       4.       5.       6.       7.       8.      9.
  33       3        5    5,000      ...      ...    5,000        7       2
  34      18       31    4,995      +52     50.5    4,993       29       2
  35      81      141    4,964     42.5     42.5    4,964      110      31
  36     185      322    4,823     33.5     34.5    4,854      323       1
  37     420      732    4,501     26.0     26.5    4,531      732       0
  38     749    1,305    3,769     18.0     18.5    3,799    1,333      28
  39   1,075    1,867    2,464     10.5     10.5    2,466    1,838      29
 ...     ...      ...      597     +2.5      2.5      628      ...     ...
  40   1,079    1,882    1,285     -5.5      5.5    1,359    1,987     105
  41     934    1,628    2,913     13       13.5    3,034    1,675      47
  42     658    1,148    4,061     21       21.5    4,130    1,096      52
  43     370      645    4,706     30       29.5    4,690      560      85
  44      92      160    4,866     35       37.5    4,911      221      61
  45      50       87    4,953     41       45.5    4,980       69      18
  46      21       38    4,991     49.5     53.5    4,996       16      22
  47       4        7    4,998    -56       61.5    4,999        3       4
  48       1        2    5,000      ...      ...    5,000        1       1
 ...   5,738   10,000      ...      ...      ...      ...   10,000     ...

The chest measurements of 5,738 soldiers were ranged in 
order of magnitude, and the numbers of men at each measurement 
placed in column 2 against the corresponding number 
of inches in column 1. Column 3 gives numbers proportional 
to those in column 2, such that their sum is 10,000. It is assumed 
that there are 5,000 cases on each side of the (unknown) mean; 
then 5,000 cases occur between 33 inches and the mean, 4,995 
between 34 inches and mean, and so on, till we find (in column 4) 
597 cases between 39 inches and mean. Similarly 1,285 occur 
between 40 inches and the mean. 

Referring now to the scale of precision, we find that 4,995 
cases correspond to rank 52; 4,964 to rank 42.5, and so on. 
The numbers of these ranks are placed in column 5. 

Now if the observations fitted the curve exactly the distances 
between two ranks corresponding to two successive inches should 
be always the same. This is not exactly the case: 34 inches 
corresponds to rank 52; 35 inches to 9.5 ranks lower, 42.5; 
36 to 9 ranks lower, and so on. It is necessary to assume 
some regular interval, which will show as close a correspondence 
as possible between theory and observation. It is assumed that 
a difference of 1 inch corresponds to 8 ranks, and column 5 is 
"smoothed" into column 6 on this hypothesis. The process is 
then reversed; against each rank in column 6 is placed in 
column 7 the corresponding number from the scale of precision. 
Column 8 is then calculated from column 7 in the reverse way to 
that by which column 4 is reckoned from column 3. The closeness 
of the resemblance between columns 8 and 3 shows how 
nearly the measurements fit the theory. In column 9 are placed 
the differences between the numbers in columns 3 and 8; the 
percentage which the sum of these differences is of the total 
number (10,000), is a measure of the fit. If the observations are 
plotted out in the same way as the later figures which form the 
$L_1 L_2 \dots$ figure on the diagram at the end of the book, this 
ratio of the sum of the differences to the whole number is a 
rough measure of the ratio of the sum of the areas included 
between the lines $L_1 L_2 \dots$ and the curve of error $A_1 A_2 A_3$ to the whole area. 
In the case just discussed the misfit is 4.88 per cent. 

* Ibid., p. 400. 
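
The measure of misfit is a short piece of arithmetic. The sketch below transcribes columns 3 and 8 of the chest-measurement table (33 to 48 inches) and recomputes column 9 and its percentage; the variable names are illustrative.

```python
# Column 3 (proportional numbers) and column 8 (calculated numbers), 33 to 48 inches.
proportional = [5, 31, 141, 322, 732, 1305, 1867, 1882,
                1628, 1148, 645, 160, 87, 38, 7, 2]
calculated = [7, 29, 110, 323, 732, 1333, 1838, 1987,
              1675, 1096, 560, 221, 69, 16, 3, 1]

# Column 9: differences without regard to sign.
differences = [abs(a - b) for a, b in zip(proportional, calculated)]

# Sum of differences as a percentage of the total number (10,000).
misfit_percent = 100 * sum(differences) / sum(proportional)
print(sum(differences), misfit_percent)
```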

The following seems a better method of estimating the fit, 
for it is less dependent on the accidental divergences caused by 
the particular interval of measurement taken. Construct a figure 
to represent the scale of precision on p. 273; fit as closely as 
possible to this another figure, whose ordinates represent the 
numbers "at or above" given measurements represented by the 
abscissae; the whole curve will be nearly the shape of the figure 
facing p. 155, when the smoothed curve may stand for the scale 
of precision, symmetrical about its median, and the original 
jagged line for the observations. The closeness of the jagged to 
the smoothed line shows the fit; and this will not be altered by a 
slight shifting of the inch or shilling limits we adopt, which in 
the other method often makes a great difference to the regularity* 
in discontinuous observations. Moreover, it is not 
necessary by the latter method to sort the observations into 
groups at all. 

* E.g., two great numbers at 29s. 9d. and 30s. 3d. respectively will both 
be in the same group if our limits are "29s. 6d. and not so much as 30s. 6d.," 
&c., but in different groups if the limits are "29s. and not so much as 
30s.," &c. 

We cannot deduce this equation from the most general 
hypotheses (stated on p. 303, infra) without the use of the 
integral calculus. It is, however, more convenient to use a scale 
of precision evaluated from the equation of the curve as obtained 
in other ways. In the next table the numbers under x correspond 
to Quetelet's "ranks," but the divergence taken as unity 
corresponds to the quantity $\sqrt{2pqn}$, which is rank 22.35 ... 
(for $\sqrt{2 \times \frac{1}{2} \times \frac{1}{2} \times 999} = 22.35 \dots$) in Quetelet's scale. The quantities 
under F(x) correspond to those in the earlier scale of precision; 
so that against any value of x is found the chance that an 
observation shall be between the average and x. The figures 
are adapted from Lexis' Massenerscheinungen, pp. 93, 94, and 
Quetelet, ibid., p. 389. This table and Quetelet's can be used 
indifferently; they yield very nearly the same results. 

Values of F(x) for Different Values of x, where $F(x) = \frac{1}{\sqrt{\pi}} \int_0^x e^{-x^2}\,dx$. 

    x.   F(x). |    x.   F(x). |    x.   F(x). |    x.   F(x). |    x.   F(x).
   .00   .000  |   .36   .195  |   .71   .342  |  1.06   .433  |  1.41   .477
   .01   .006  |   .37   .200  |   .72   .346  |  1.07   .435  |  1.42   .478
   .02   .011  |   .38   .205  |   .73   .349  |  1.08   .437  |  1.43   .478
   .03   .017  |   .39   .209  |   .74   .352  |  1.09   .438  |  1.44   .479
   .04   .023  |   .40   .214  |   .75   .356  |  1.10   .440  |  1.45   .480
   .05   .028  |   .41   .219  |   .76   .359  |  1.11   .442  |  1.46   .481
   .06   .034  |   .42   .224  |   .77   .362  |  1.12   .443  |  1.47   .481
   .07   .039  |   .43   .228  |   .78   .365  |  1.13   .445  |  1.48   .482
   .08   .045  |   .44   .233  |   .79   .368  |  1.14   .447  |  1.49   .482
   .09   .051  |   .45   .238  |   .80   .371  |  1.15   .448  |  1.50   .483
   .10   .056  |   .46   .242  |   .81   .374  |  1.16   .450  |  1.52   .484
   .11   .062  |   .47   .247  |   .82   .377  |  1.17   .451  |  1.54   .485
   .12   .067  |   .48   .251  |   .83   .380  |  1.18   .452  |  1.56   .486
   .13   .073  |   .49   .256  |   .84   .383  |  1.19   .454  |  1.58   .487
   .14   .078  |   .50   .260  |   .85   .385  |  1.20   .455  |  1.60   .488
   .15   .084  |   .51   .265  |   .86   .388  |  1.21   .456  |  1.62   .489
   .16   .090  |   .52   .269  |   .87   .391  |  1.22   .458  |  1.64   .490
   .17   .095  |   .53   .273  |   .88   .393  |  1.23   .459  |  1.66   .491
   .18   .100  |   .54   .277  |   .89   .396  |  1.24   .460  |  1.68   .491
   .19   .106  |   .55   .282  |   .90   .398  |  1.25   .461  |  1.70   .492
   .20   .111  |   .56   .286  |   .91   .401  |  1.26   .463  |  1.72   .493
   .21   .117  |   .57   .290  |   .92   .403  |  1.27   .464  |  1.74   .493
   .22   .122  |   .58   .294  |   .93   .406  |  1.28   .465  |  1.76   .494
   .23   .128  |   .59   .298  |   .94   .408  |  1.29   .466  |  1.78   .494
   .24   .133  |   .60   .302  |   .95   .410  |  1.30   .467  |  1.80   .495
   .25   .138  |   .61   .306  |   .96   .413  |  1.31   .468  |  1.82   .495
   .26   .143  |   .62   .310  |   .97   .415  |  1.32   .469  |  1.84   .495
   .27   .149  |   .63   .314  |   .98   .417  |  1.33   .470  |  1.86   .496
   .28   .154  |   .64   .317  |   .99   .419  |  1.34   .471  |  1.88   .496
   .29   .159  |   .65   .321  |  1.00   .421  |  1.35   .472  |  1.90   .496
   .30   .164  |   .66   .325  |  1.01   .423  |  1.36   .473  |  1.92   .496
   .31   .169  |   .67   .328  |  1.02   .425  |  1.37   .474  |  1.94   .497
   .32   .175  |   .68   .332  |  1.03   .427  |  1.38   .475  |  1.96   .497
   .33   .180  |   .69   .335  |  1.04   .429  |  1.39   .475  |  1.98   .497
   .34   .185  |   .70   .339  |  1.05   .431  |  1.40   .476  |  2.00   .498
   .35   .190  |               |               |               |  2.05   .498

    x.    F(x).
   2.20   .49907
   2.50   .49980
   3.00   .499989
   4.00   .499999992
   5.00   .4999999999992
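
The tabulated values can be regenerated from the error function. The sketch below assumes that F(x) is the integral of $e^{-x^2}/\sqrt{\pi}$ from 0 to x, i.e. half of the standard `erf`; that assumption reproduces the entries that are legible in the table, such as F(1) = .421 and F(2.20) = .49907.

```python
from math import erf


def F(x):
    """Chance that an observation lies between the average and x moduli from it."""
    return 0.5 * erf(x)


for x in (0.10, 0.50, 1.00, 2.00, 2.20):
    print(f"{x:4.2f}  {F(x):.5f}")
```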



Special Points on the Curve. 

Before we can show how to fit observations to this table, we 
must consider the equation of the law of error more closely. 

Definition. The probable error of a series of observations 
is that divergence from their mean on either side, within which 
exactly half the observations lie. This quantity is more appropriately 
called the Quartile Deviation.* 

In the scale of precision the corresponding number is .25, 
either above or below the mean; the corresponding rank in 
Quetelet's scale is about 10.7. In the table just given, the value 
of F(x) which corresponds to the probable error is .25, and the 
corresponding value of x is calculated to be .47694, a quantity 
usually designated by ρ. 

An approximation to the probable error for a given series 
of observations is obtained by arranging all the observations 
in order of magnitude; marking the magnitude, say α, above 
which 25 per cent. of the observations lie, and the magnitude, 
say β, below which 25 per cent. lie. Half the difference between 
α and β is the probable error. 

A useful way of illustrating this is to say that if one observation 
is chosen at random out of a group, it is as likely as not 
that it will lie between the average and the probable error. 

In the figure given at the end of the book, the probable 
errors are at the points $P_1$, $P_2$ for the curves A and C, and $p_1$, $p_2$ 
for the curve B. 

By means of the approximation given in the last paragraph 
but two, the curve of error can be fitted to a series of observations 
by equating the probable error so determined to the value 
of x = .47694, and comparing the values of F(x) with the ranged 
observations; but though this method is simple and rapid it is 
not the best. 
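
Both steps of this rapid method can be sketched in a few lines. The first function solves F(x) = .25 for the constant .47694, assuming F(x) is half the standard error function; the second takes half the distance between the quartiles of a set of observations, using a simple linear interpolation that is a choice made here and not something prescribed by the text. All names are illustrative.

```python
from math import erf


def probable_error_constant(tol=1e-12):
    """Solve 0.5 * erf(rho) = 0.25 by bisection on the interval [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 0.5 * erf(mid) < 0.25:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def quartile_deviation(observations):
    """Half the difference between the upper and lower quartiles."""
    data = sorted(observations)

    def quantile(fraction):
        position = fraction * (len(data) - 1)
        i = int(position)
        rest = position - i
        if rest == 0:
            return data[i]
        return data[i] + rest * (data[i + 1] - data[i])

    return (quantile(0.75) - quantile(0.25)) / 2


rho = probable_error_constant()
print(rho)
```

Dividing the quartile deviation of a series by `rho` then gives the unit of abscissa with which the ranged observations are compared against F(x).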

By a suitable change of scale for ordinate and abscissa the 
equation given on p. 277 can be written $y = e^{-x^2}$, and this is the 
most general equation of the normal curve of error. 

If x = 0, y = 1; hence the unit of ordinate is the number of 
cases at zero error; from the table on p. 281 above, the unit of 
abscissa is the probable error ÷ .47694. Since $\int_{-\infty}^{\infty} e^{-x^2}\,dx$ is 
shown in the integral calculus to be $\sqrt{\pi}$, the area contained between 
the curve and the axis of x is $\sqrt{\pi}$. If the equation is written 
$y = \frac{1}{\sqrt{\pi}}\, e^{-x^2}$ its area is 1, that is, unit of area equals unit of 
probability (that is certainty) and the area contained by any two 
ordinates, the curve and the axis of abscissa equals the probability 
of an occurrence between the errors represented by the abscissae 
of those ordinates.* 

* See Yule in Statistical Journal, 1896, p. 330. 

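If the curve is traced from either of the tables, it will be 
found that it changes from concavity to convexity on each side 
of the maximum ordinate, at such points as $s_1$, $s_2$, $s_3$, $s_4$ in the 
following diagram. If the unit of abscissa is taken as a concrete 
quantity, say, 1 inch, and the abscissa ($OS_2$) of this point of inflexion 
is ε, in the same units, then the equation of the curve is 

$$y = \frac{1}{\sqrt{\pi}}\, e^{-\frac{x^2}{2\epsilon^2}}, \quad \text{for} \quad \frac{d^2y}{dx^2} = \frac{1}{\sqrt{\pi}}\, e^{-\frac{x^2}{2\epsilon^2}} \left(\frac{x^2}{\epsilon^4} - \frac{1}{\epsilon^2}\right) = 0, \text{ if } x = \pm\epsilon,$$

and therefore the points $\left(\pm\epsilon,\ \frac{1}{\sqrt{\pi e}}\right)$ are points of inflexion. 

Let $2\epsilon^2 = c^2 = \frac{1}{h^2}$; then the equation is $y = \frac{1}{\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}$ or $y = \frac{1}{\sqrt{\pi}}\, e^{-h^2x^2}$. 

The area of this curve is $\int_{-\infty}^{\infty} \frac{1}{\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}\,dx = c$. 

Choose the unit of ordinate so that this area shall be unity. 
Then the equation is 

$$y = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}} = \frac{h}{\sqrt{\pi}}\, e^{-h^2x^2}.$$

c, which thus determines the unit of abscissae of the curve, is 
called the modulus. 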



Determination of the Modulus. 

Suppose we have a series of observations which we know are 
selected from a group which conforms to the law of error; it is 
required to find, from the observations, the centre and modulus 
of the curve from which they would come with least improbability. 

Let $x_1, x_2, \dots x_n$ be the observed values. Let $\bar{x}$ be their arithmetic 
mean. Let $\delta_1, \delta_2, \dots$ be the divergencies of the values from 
this mean. 

* Hence chance of error between x and x + dx is $\frac{1}{\sqrt{\pi}}\, e^{-x^2}\,dx$. 

Then $\delta_1 = x_1 - \bar{x}$, $\delta_2 = x_2 - \bar{x}$, ...; $\Sigma\delta = \Sigma x - n\bar{x} = 0$, for 
$\bar{x} = \frac{\Sigma x}{n}$. 

Let the equation of the curve to which the observations belong be 
$y = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{(x-k)^2}{c^2}}$, where c and k have to be determined. 

Let $y_1, y_2, \dots y_r \dots y_n$ be the values of y corresponding to 
$x_1, x_2, \dots x_r \dots x_n$. 

Then $y_r\,dx$ is the chance that an object taken at random from the 
group conforming to the curve shall be between $x_r$ and $x_r + dx$.* 
Let P be the chance that the n given observations occur together. 
Then 

$$P = y_1 \cdot y_2 \dots y_n = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{(x_1-k)^2}{c^2}} \times \frac{1}{c\sqrt{\pi}}\, e^{-\frac{(x_2-k)^2}{c^2}} \times \dots \times \frac{1}{c\sqrt{\pi}}\, e^{-\frac{(x_n-k)^2}{c^2}} = \frac{1}{(c\sqrt{\pi})^{n}}\, e^{-\frac{\Sigma (x-k)^2}{c^2}}.$$

Now $\Sigma(x-k)^2 = \Sigma(\delta + \bar{x} - k)^2 = \Sigma\delta^2 + 2(\bar{x}-k)\,\Sigma\delta + n(\bar{x}-k)^2 = \Sigma\delta^2 + n(\bar{x}-k)^2$, since $\Sigma\delta = 0$. 

Whatever value we assign to c, P will be greatest when the quantity 
$(\bar{x}-k)^2$ is least, that is when $k = \bar{x}$. 
Giving k this value, 

$$P = \frac{1}{(c\sqrt{\pi})^{n}}\, e^{-\frac{\Sigma\delta^2}{c^2}},$$

$$\frac{dP}{dc} = \pi^{-\frac{n}{2}}\, e^{-\frac{\Sigma\delta^2}{c^2}} \left\{-n\,c^{-n-1} + 2\,\Sigma\delta^2\, c^{-n-3}\right\}.$$

In order that P may be a maximum, $\frac{dP}{dc}$ must = 0.† 

Hence $c = +\sqrt{\frac{2\Sigma\delta^2}{n}}$.‡ 

Thus the curve required has its centre at the arithmetic 
mean of the observed values and modulus equal to $\sqrt{\frac{2\Sigma\delta^2}{n}}$, 
where the δ's are the differences between the observed values 
and their mean.* 

* When the magnitude of the observations is discontinuous, as in 
Quetelet's scale, no dx is necessary; but if the magnitude is continuous the 
probability of any defined error is zero, and the y is the chance of an error 
between infinitesimal limits. 

† And $\frac{d^2P}{dc^2}$ be negative, which can be shown to be the case here. 

‡ The proof here given is based on Merriman's Method of Least Squares; 
but it is suggested that his statement "for a given system of errors, it must 
be considered that the observations have been as precise as possible," § 65, 
is unnecessarily obscure. 
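
The result can be checked numerically by evaluating log P near the supposed maximum. This is a minimal sketch on made-up observations; the function names and the sample values are illustrative assumptions, not data from the text.

```python
from math import log, pi, sqrt


def centre_and_modulus(observations):
    """Arithmetic mean and modulus sqrt(2 * sum(delta^2) / n)."""
    n = len(observations)
    mean = sum(observations) / n
    c = sqrt(2 * sum((x - mean) ** 2 for x in observations) / n)
    return mean, c


def log_chance(observations, k, c):
    """log of P = product of (1 / (c sqrt(pi))) * exp(-(x - k)^2 / c^2)."""
    n = len(observations)
    return -n * log(c * sqrt(pi)) - sum((x - k) ** 2 for x in observations) / c ** 2


sample = [1.0, 2.0, 4.0, 7.0]  # hypothetical observations
k_hat, c_hat = centre_and_modulus(sample)
best = log_chance(sample, k_hat, c_hat)

# Moving either the centre or the modulus away from these values lowers log P.
print(best, log_chance(sample, k_hat + 0.1, c_hat), log_chance(sample, k_hat, 1.1 * c_hat))
```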

c, so determined, is called the modulus; $h = \frac{1}{c}$ is called the 
precision. Professor Edgeworth proposes to call $c^2 = \frac{2\Sigma\delta^2}{n}$ the 
fluctuation.† 

It can be shown that half the curve $y = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}$ is included 
between the ordinates corresponding to ±r, when r = .47694 c, 
as found from the scale of precision. 

It can be shown that the arithmetic average (η) of all the 
errors, considered positive,‡ is given by 

$$\eta = \frac{1}{h\sqrt{\pi}}, \quad \text{whence } r = .8453\,\eta.$$

η is the abscissa of the centroid of the positive half of the 
curve. 

η is more easily calculated from the observations than r, and 
can in some cases be used in its stead.§ η is called the average 
error or mean of errors. 

In the table given on p. 281, the modulus is taken as 1; 
when x = 1, F(x) = .421 ..; that is, .421 .. of the curve lies between 
the ordinate corresponding to the abscissa equal to the modulus 
and the central ordinate. When x = 2, F(x) = .4976 .. Hence the 
chance of an observation showing a divergence from the mean 
on either side of more than twice the modulus = .005 ..; the 
corresponding rank in Quetelet's table is 45.1. 

ε, used on p. 283, is now seen to be equal to $\sqrt{\frac{\Sigma\delta^2}{n}}$ and is 
called the error of mean square, or the standard deviation.¶ 

* See also p. 307, infra. 

† Stat. Journal: Jubilee Number, p. 188; and p. 298, infra. 

‡ $\eta = \frac{2}{c\sqrt{\pi}} \int_0^{\infty} x\, e^{-\frac{x^2}{c^2}}\,dx = \frac{c}{\sqrt{\pi}}$. 

§ Stat. Journal, loc. cit. 

¶ Both ε and η have been called mean error, and this term has become 
misleading. 

In the diagram at the end $OP_1$, $OP_2$ are the probable errors, 
$OM_1$, $OM_2$ the moduli, $OS_1$, $OS_2$ the errors of mean square 
and $OE_1$, $OE_2$ the mean errors for the curves A and C. $Op_1$, 
$Op_2$, $Om_1$, $Om_2$, $Oe_1$, $Oe_2$ are corresponding quantities for the 
curve B. It is to be noticed that the line $XOX'$ is an asymptote 
of each of the three curves drawn. In the figure $OG_5$ equals 
about twice the modulus. The ratio of the small area to the 
right of a vertical line through $G_5$ to the area of half the curve is 
the chance that twice the modulus shall be exceeded. The 
distance between $OX'$ at three times the modulus and the curve 
is too small to be shown. 
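
The special points named above are all fixed multiples of the modulus, so one routine can supply them together. This sketch uses the relations stated in the text (probable error .47694 c, mean error c/√π, error of mean square c/√2, precision 1/c); the function and key names are illustrative.

```python
from math import pi, sqrt

RHO = 0.47694  # abscissa at which F(x) = .25, as quoted in the text


def special_points(c):
    """Precision and characteristic errors of a curve of error with modulus c."""
    return {
        "precision": 1 / c,
        "probable_error": RHO * c,
        "mean_error": c / sqrt(pi),
        "error_of_mean_square": c / sqrt(2),
    }


points = special_points(1.0)
print(points)
print(points["probable_error"] / points["mean_error"])  # close to .8453
```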

From the foregoing it will be seen that any curve of error, 
$y = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}$, can be obtained by projection from the same 
standard curve $y = e^{-x^2}$, just as any ellipse can be obtained by 
projection from a circle; but as ellipses differ from one another 
in virtue of different values of their eccentricity, so curves of 
error differ from one another in virtue of different values of their 
modulus. As we have seen, on any such curve there are certain 
definite points (the positions of the modulus, mean error, probable 
error, and error of mean square); if then we have the same 
units of abscissae, such as 1 inch, for two sets of observations, 
these points will take different positions. If for one set the 
modulus is 2 inches, and for another 1 inch, .843 of the observations 
will be within 2 inches of the mean in the first case, and 
within 1 inch of the mean in the second. If we regard the 
observations as attempts to hit the mean and the divergencies 
as errors, the aiming in the second case is twice as precise 
as in the first. The precision thus defined is in inverse proportion 
to the modulus, and is therefore suitably measured by $h = \frac{1}{c}$. 
If the standard form of equation is adopted in both cases, the 
area of both curves will be unity, and their actual shapes those 
of $C_1 C_2 C_3$ and $B_1 B_2 B_3$. 

The calculation of the precision of a set of observations does 
not require either of the tables, but is simply the evaluation of 
$h = \sqrt{\frac{n}{2\Sigma\delta^2}}$ from the observations themselves. 

Another form of the equation in common use is $y = \frac{n}{c\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}$, 
where n is the whole number of observations in question. The 
area of this curve is n, so that unit area corresponds to one 
observation. 

Compare now this form with that obtained from the limit 
of the binomial expansion, $y = P\, e^{-\frac{x^2}{2npq}}$. 

If x = 0, y = P; hence P is the maximum ordinate. 
Adjusting the unit of ordinate so that the area of the curve is n, 

$$y = \frac{n}{\sqrt{2\pi npq}}\, e^{-\frac{x^2}{2npq}}.$$

Hence $2npq = c^2$, and therefore $2np(1-p) = c^2$. 



Examples. 

I. The following figures, which are taken from Professor 
Westergaard's Die Grundzüge der Theorie der Statistik, but 
treated in a manner different from his, will serve to illustrate the 
meaning of the formulae, and show how to fit a curve to observations; 
limits of space prevent more elaborate examples. 

Births in Denmark. 

             Number.            Percentage Boys    Difference from
  Year.    Total.    Boys.         of Total.          Average.
  1860     54,797    28,308         51.66               +.23
  1861     53,747    27,506         51.17               -.26
  1862     53,011    27,300         51.50               +.07
  1863     53,939    27,841         51.62               +.19
  1864     52,884    27,334         51.68               +.25
  1865     55,434    28,483         51.38               -.05
  1866     57,353    29,747         51.87               +.44
  1867     54,763    28,036         51.20               -.23
  1868     56,546    28,985         51.26               -.17
  1869     54,056    27,577         51.02               -.41
  1870     56,472    29,144         51.60               +.17
  1871     56,407    29,045         51.49               +.06
  1872     57,274    29,462         51.44               +.01
  1873     58,616    30,115         51.37               -.06
  1874     59,324    30,594         51.57               +.14
  1875     61,791    31,784         51.44               +.01
  1876     63,967    32,912         51.45               +.02
  1877     63,772    32,508         50.98               -.45
  1878     63,144    32,505         51.48               +.05
  1879     64,363    33,114         51.45               +.02
  Average  57,583     ...           51.43                ...

Calculated modulus $\sqrt{2 \times 57583 \times .5143 \times .4857} = 169.61$ for 
a total of 57,583. Equivalent to .2945 ... for a total of 100. 

In the formula $\sqrt{2pqn}$, p, the chance that any child born is a
boy, is taken as .5143, since 51.43 is the average percentage male
births are of total births. q = 1 - p = .4857. The number of
experiments n is taken to be the average number of births per
year. Then $\sqrt{2pqn}$ is found to be 169.61. This is the modulus
to apply to the whole number of births; but since this differs
year by year it is convenient to reduce all numbers to
percentages. The examples are then arranged as "between average
+ modulus and average - modulus," "between average + 9/10 of
modulus and average - 9/10 of modulus," and so on; the first
group (± .460) is taken so as to include the extreme.









                                        Calculated.
Within.           Observed.    x.       F(x) × 20.

51.44 ± .460         20        ...      .972 of 20 = 19.4
51.44 ± .295         17        1.       .843   "   = 16.8
51.44 ± .265         17        .9       .797   "   = 15.9
51.44 ± .236         15        .8       .741   "   = 14.8
51.44 ± .206         13        .7       .678   "   = 13.6
51.44 ± .177         12        .6       .604   "   = 12.1
51.44 ± .147         10        .5       .521   "   = 10.4
51.44 ± .118          9        .4       .428   "   =  8.6
51.44 ± .088          9        .3       .329   "   =  6.6
51.44 ± .057          6        .2       .223   "   =  4.5
51.44 ± .029          4        .1       .112   "   =  2.2



The numbers to be expected from theory are given along- 
side ; the fit is fairly close, and can be tested by the method 
described on p. 279. 
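
The fitting of Example I can be retraced with a short script. This is only a sketch: it assumes that F(x) in the table is the integral $\frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}dt$ (available in Python as `math.erf`), which agrees with the tabulated values .843, .797, and so on; the variable and function names are illustrative and do not come from the text.

```python
import math

# Differences from the average percentage (51.43), 1860-1879, as tabulated.
diffs = [.23, -.26, .07, .19, .25, -.05, .44, -.23, -.17, -.41,
         .17, .06, .01, -.06, .14, .01, .02, -.45, .05, .02]

n_births = 57583          # average number of births per year
p = 0.5143                # chance that a child born is a boy
q = 1 - p

# Modulus for the whole number of births, then reduced to a percentage.
modulus_total = math.sqrt(2 * n_births * p * q)
modulus_pct = 100 * modulus_total / n_births

def expected_within(x, years=20):
    """Expected number of years lying within x moduli of the average."""
    return years * math.erf(x)

def observed_within(limit):
    """Number of tabulated years whose difference does not exceed limit."""
    return sum(1 for d in diffs if abs(d) <= limit + 1e-9)

for x in (1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1):
    limit = round(x * modulus_pct, 3)
    print(f"+/-{limit:.3f}  observed {observed_within(limit):2d}  "
          f"calculated {expected_within(x):4.1f}")
```

The observed counts are taken from the differences as printed (reckoned from 51.43), so a year lying very near a limit could fall on the other side if the unrounded percentages were used.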

The modulus calculated by the formula $\sqrt{2\Sigma\delta^2/n}$ is .305.

II. If a digit is taken at random, the chance that it will be
less than 5 (0, 1, 2, 3, or 4) is ½. If we take a book of logarithms
and note the digits in the 7th decimal places in successive
numbers, we shall have a practically random selection of digits.
If we take groups of 50 numbers, the chances of 0, 1, 2 ... 50
occurrences of digits less than 5 are given respectively by the
terms of the expansion $(\frac{1}{2} + \frac{1}{2})^{50}$. This experiment was repeated
300 times and the results tested in accordance with both the
formulae for the modulus, viz.,
$\sqrt{2pqn} = \sqrt{2 \times \frac{1}{2} \times \frac{1}{2} \times 50} = 5$, and
$\sqrt{2\Sigma\delta^2/n}$, which was found to be 4.75.

The values of the other standard errors were both found
directly and deduced from c, with very fair correspondence
between the two methods.



Number of Occurrences of 0, 1, 2, 3, or 4 in the 7th
Decimal Places of Groups of 50 Logarithms.

                                             Averages of
                                             Lines.

29  19  25  25  22  28  16  23  22  27          23.6
24  28  30  22  20  27  24  24  27  22          24.8
27  26  28  21  21  22  22  27  25  25          24.4
25  28  21  23  22  23  27  27  25  25          24.6
28  23  26  23  22  29  28  25  23  26          25.3
24  22  22  19  26  24  26  28  20  25          23.6
24  23  27  29  26  21  26  31  23  27          25.7
26  26  30  25  25  24  29  25  21  27          25.8
30  24  25  27  24  30  24  28  24  30          26.6
26  21  21  22  31  28  26  26  26  33          26.0
19  25  26  34  21  28  21  29  19  23          24.5
29  26  19  29  24  27  25  25  24  22          25.0
24  27  21  23  25  21  26  28  25  27          24.7
25  28  29  30  28  27  28  23  25  26          26.9
30  30  18  22  24  23  25  27  25  31          25.5
25  26  27  21  23  25  24  20  25  22          23.8
28  24  20  18  25  19  25  30  29  25          24.3
22  27  24  28  22  20  23  25  26  26          24.3
16  27  28  27  23  20  29  26  20  24          24.0
26  28  28  23  21  24  25  21  14  28          23.8
25  24  21  24  24  21  24  30  25  26          24.4
28  32  17  23  29  24  22  33  29  29          26.6
25  26  26  27  22  20  24  26  24  24          24.4
28  26  25  25  29  25  24  22  26  25          25.5
22  30  22  27  25  27  27  27  28  21          25.6
35  26  23  26  31  28  26  22  22  29          26.8
26  20  28  23  22  28  24  23  30  16          24.0
28  26  25  31  27  28  22  26  30  19          26.2
26  27  27  18  25  24  27  22  30  30          25.6
26  25  24  17  27  32  25  21  23  30          25.0

Averages of Columns.
25.9 25.7 24.4 24.4 24.5 24.9 24.8 25.7 24.5 25.7

1.             2.        3.         4.          5.             6.
Number of                           Squares     Product of     Product of
Occurrences.   Times.    Error.     of Error.   Col. 2 and     Col. 2 and
                                                Col. 3.        Col. 4.

14              1       -11.04      121.88      - 11.04        121.88
16              3       - 9.04       81.72      - 27.12        245.16
17              2       - 8.04       64.64      - 16.08        129.28
18              3       - 7.04       49.56      - 21.12        148.68
19              7       - 6.04       36.48      - 42.28        255.36
20              9       - 5.04       25.40      - 45.36        228.60
21             18       - 4.04       16.32      - 72.72        293.76
22             26       - 3.04        9.24      - 79.04        240.24
23             21       - 2.04        4.16      - 42.84         87.36
24             32       - 1.04        1.08      - 33.28         34.56
25             42       -  .04         .00      -  1.68           .00
26             36       +  .96         .92      + 34.56         33.12
27             30       + 1.96        3.84      + 58.80        115.20
28             28       + 2.96        8.76      + 82.88        245.28
29             15       + 3.96       15.68      + 59.40        235.20
30             16       + 4.96       24.60      + 79.36        393.60
31              5       + 5.96       35.52      + 29.80        177.60
32              2       + 6.96       48.44      + 13.92         96.88
33              2       + 7.96       63.36      + 15.92        126.72
34              1       + 8.96       80.28      +  8.96         80.28
35              1       + 9.96       99.20      +  9.96         99.20

               300       ...         ...        Sum of Errors,  Sum of
                                                all reckoned    Squares.
                                                positive.
                                                  785.12        3387.96

Average, 25.04. Median, 25. Quartiles, 27, 23. Hence probable error is
approximately 2.

Mean error $= \frac{785.12}{300} = 2.617$.

Error of mean square $= \sqrt{\frac{3387.96}{300}} = 3.36 = \epsilon$.

Modulus $= \epsilon \times \sqrt{2} = 4.75 = c$.

Also probable error $= .4769\,c = 2.265 = r$, and mean error
$= r \div .8453 = 2.68$, nearly the values obtained directly.
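
The direct calculation of these standard errors can be repeated from the frequency table. The sketch below is illustrative only: the dictionary `freq` and the helper `quantile` are names chosen here, and the deviations are reckoned from the unrounded average rather than from 25.04, so the results should land close to, not exactly on, the figures quoted above.

```python
import math

# Frequency table from the text: occurrences -> number of groups (300 in all).
freq = {14: 1, 16: 3, 17: 2, 18: 3, 19: 7, 20: 9, 21: 18, 22: 26,
        23: 21, 24: 32, 25: 42, 26: 36, 27: 30, 28: 28, 29: 15,
        30: 16, 31: 5, 32: 2, 33: 2, 34: 1, 35: 1}

n = sum(freq.values())
mean = sum(k * f for k, f in freq.items()) / n

# Average of the deviations, all reckoned as positive.
mean_error = sum(f * abs(k - mean) for k, f in freq.items()) / n
# Square root of the average squared deviation.
mean_square_error = math.sqrt(
    sum(f * (k - mean) ** 2 for k, f in freq.items()) / n)
modulus = mean_square_error * math.sqrt(2)

def quantile(position):
    """Value of the observation holding the given 1-based rank."""
    running = 0
    for k in sorted(freq):
        running += freq[k]
        if running >= position:
            return k

print(n, round(mean, 2))
print(round(mean_error, 2), round(mean_square_error, 2), round(modulus, 2))
print(quantile(n // 4), quantile(n // 2), quantile(3 * n // 4))
```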



The following table compares the distribution with that of
the normal curve, by a method differing from those previously
used.

Divergence of 1 from average corresponds to $x = \frac{1}{4.75} = .21$.

                    Observed.
Average      Above       Below
and          Average.    Average.    Total.     x.     ½F(x).

±  1            36    +    42    =     78       .21     .117   of 600 =  70
±  2            66    +    74    =    140       .42     .224     "    = 134
±  3            94    +    95    =    189       .63     .314     "    = 188
±  4           109    +   121    =    230       .84     .383     "    = 230
±  5           125    +   139    =    264      1.05     .431     "    = 259
±  6           130    +   148    =    278      1.26     .463     "    = 277
±  7           132    +   155    =    287      1.47     .481     "    = 289
±  8           134    +   158    =    292      1.68     .491     "    = 295
±  9           135    +   160    =    295      1.89     .496     "    = 298
± 10           136    +   163    =    299      2.10     .499     "    = 299
± 11           136    +   163    =    299      2.31     .499     "    = 300
± 12           136    +   164    =    300      2.52     .499     "    = 300

The fit is close. The symmetry is spoilt by the great 
number 42 at 25 occurrences, just below the average. 

The line $L_1 L_2 \ldots L_{12}$ on the diagram of the curve of error
at the end of the book shows these numbers. The moduli are
made to correspond, which defines the abscissae, and the scale
of ordinates is then decided by making the areas of the two
figures equal, but this was not done exactly.

Summary of Terms. 

For convenience of reference the principal quantities con- 
nected with the curve of error are collected below. 

If we take $y = \frac{1}{c\sqrt{\pi}}\, e^{-x^2/c^2}$ as the equation of the curve, c, which
determines the unit of abscissae, is called the modulus.

If $\delta_1, \delta_2 \ldots$ are the differences between the observations and
their arithmetic average, c should be taken as $\sqrt{\frac{2\Sigma\delta^2}{n}}$ or
$\sqrt{\frac{2\Sigma\delta^2}{n-1}}$, where n is the number of observations. (See pages
285 and 307.)

If the curve can be also determined as the limit of an
assigned binomial expansion $(p+q)^n$, then c may be taken as
equal to $\sqrt{2pqn}$. (See page 287.)

$\frac{2\Sigma\delta^2}{n}$ or $\frac{2\Sigma\delta^2}{n-1}$ is called the fluctuation, and equals $c^2$. (See
pages 285 and 307.)

h, $= \frac{1}{c}$, is the precision.

The square root of the average of the squares of the $\delta$'s,
$= \sqrt{\frac{\Sigma\delta^2}{n}}$, is called the error of mean square, or the standard
deviation, and is represented by the letters $\epsilon$ or $\sigma$.

Hence $c = \epsilon\sqrt{2}$.

The arithmetic average of all the $\delta$'s, all reckoned as positive,
is called the average error. It is equal to the distance of the
centre of gravity of half the curve from the central ordinate. It is
represented by the letter $\eta$, and $\eta = \frac{c}{\sqrt{\pi}} = .5641896\,c$.

The probable error is half the distance between the quartiles
of the observations. The ordinates through the points whose
abscissae are the probable errors bisect the two symmetrical
halves of the curve. The probable error is represented by the
letter r, and $r = .4769363\,c$. This is generally written $r = \rho c$, where
$\rho = .4769363$.

c, h, $\epsilon$, $\eta$ and r can all be calculated directly from the
observations. In general the values so calculated will not satisfy the
above numerical relations exactly; the correspondence depends
on the closeness of the fit of the observations to the curve.
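
The numerical factors connecting these quantities can be recovered with a few lines of code. This is a sketch under one stated assumption: that the probable error is the abscissa at which the integral $\frac{2}{\sqrt{\pi}}\int_0^{x} e^{-t^2}dt$ (Python's `math.erf`) reaches one half. The helper name `erf_inverse_half` is hypothetical, and the modulus 4.75 is simply borrowed from Example II.

```python
import math

def erf_inverse_half(tol=1e-12):
    """Solve erf(rho) = 1/2 by bisection; rho is the probable error in units of c."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if math.erf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

c = 4.75                          # modulus (value found in Example II)
h = 1 / c                         # precision
epsilon = c / math.sqrt(2)        # error of mean square (standard deviation)
eta = c / math.sqrt(math.pi)      # average error
rho = erf_inverse_half()
r = rho * c                       # probable error

print(round(1 / math.sqrt(math.pi), 7))   # factor connecting eta with c
print(round(rho, 7))                      # factor connecting r with c
print(round(epsilon, 2), round(eta, 2), round(r, 3))
```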
Section III. — To what Groups does Law of Error apply?

The meaning of luck.

Returning to our discussion on the relation between the laws
of probability and the numerical facts of actual experience, let
us consider the meaning of such phrases as "a rare occurrence,"
"an improbable event," "a run of luck," "a lucky man," and
similar expressions which show that some
events are regarded as ordinary, others as extraordinary. On 
this subject there is a great deal of popular confusion ; thus 
the Spectator opens its columns to people who write about 
extraordinary coincidences, e.g., that on 3rd March in two
successive years two persons of the same name died at the same
age in neighbouring villages ; and recently the concurrence of 
the two names Arthur and Mallory in a dispatch was instanced 
as remarkable. Now, à priori, these two names are just as
likely to be mentioned together as any other two borne by 
equal numbers of persons. If out of n persons, p bear the
one and q the other, the chance that the first two names given
in an assigned place will be these is about $\frac{p}{n} \times \frac{q}{n}$; but the
chance that they will occur together in the newspapers in a
given week is much greater, viz., $\frac{pq}{n^2} \times N$, where N is the
number of pairs of names in conjunction in all the columns
of the press together. Going a step further, consider the
number of pairs of names that, when placed together, would
recall some event of historic or other interest, and suppose this
to be M; then the chance that some such coincidence should
arise in a given week is $\frac{pq}{n^2} \times N \times M$, if we suppose for the
sake of argument that p and q are the same for all the pairs
concerned. From these remarks it will be seen that before
we can speak of an event as extraordinary, we must define the
time, place, circumstances, and nature of such events. Further,
suppose we decide to regard an event as unusual if the chance
of its occurrence thus defined is less than $\frac{1}{r}$, where r is a large
number, it is easily seen that we may expect the improbable,
to speak paradoxically; for great though r may be, the
number of events which come under our cognizance is also
great; and we may therefore expect to find on an average 
one improbable event for every r we notice ; hence it is possible 
for a weekly newspaper with the help of the widely-extended 
search for sensations of its intelligence department to supply 
us week by week with its quantum of horrors. Another aspect 
of the same subject will be seen when we deal with the per- 
manence and regularity of certain small numbers in Section IV. 

The idea of rarity.

The rarity of an event is often unconsciously determined
by a mental forecast of its occurrence. If I take four cards
out of a well-shuffled pack and find them to be
in succession, ace, king, queen, knave of hearts, I
should feel surprised; not because these four cards are less
likely to come than any other four assigned cards whatever,
but because I have certain associations with them in that they
form a sequence which is valuable in certain games, and are
the highest cards of a suit; there are noted in my mind
unconsciously many groups of four cards of such special
significance. If there are s such groups, the chance that one of
them will occur is $\frac{s}{{}^{52}C_4}$ if we do not, and $\frac{s}{{}^{52}P_4}$ if we do, regard
the order of their occurrence.

The real difference between a rare and a common event 
is, however, independent of any mental process or prejudices. 
If I place 8 coins one after another in front of me, it is no 
more unlikely that I shall get 8 heads than that I shall get 
any other assigned order of heads and tails, say htththtt; 

the chance of either is $\frac{1}{2^8}$; but it is much more unlikely that
I shall get 8 heads than that I shall get 5 tails and 3 heads
without regard to the order in which they come; for out of
($2^8$ =) 256 possible arrangements, only 1 gives 8 heads, but
($\frac{8!}{5!\,3!}$ =) 56 give 5 tails and 3 heads.

The greatness of the improbability dealt with by the law of error.

Apply this argument to our hypothesis as to great numbers.
Suppose the population to be composed of males and females
in equal numbers, and that 1,000 persons are
selected on some system quite unconnected with sex.
Out of $2^{1000}$ ($= 10^{301}$) possible selections (differing
from one another only in arrangement in order of
sexes) only 1 gives 1,000 males, but $\frac{1000!}{500!\,500!}$ ($= 10^{299} \times 2.7$)
arrangements give 500 of each sex, independently of the order. The
chance of the first occurrence is $\frac{1}{10^{301}}$,* of the second about
1 in 37. In statistics we are concerned with the totals, depending
only on the combinations of the items, not on their order
(the permutations); and occurrences of the numbers near the
average (500, 499, 501, &c.) are separately much more and
conjointly very much more probable than occurrences of the
numbers far from it. The vast improbability of very great
divergence can be seen by a numerical study of the curve of
error (see p. 281).
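
The counts in the last two paragraphs can be checked exactly with integer arithmetic. The sketch below uses only Python's standard library; the variable names are illustrative. It suggests that the exact chance of an even division of the 1,000 is nearer 1 in 40, the "1 in 37" above following from the rounding of $2^{1000}$ to $10^{301}$.

```python
from fractions import Fraction
from math import comb

# Eight coins: arrangements giving 8 heads versus 3 heads and 5 tails.
arrangements_8 = 2 ** 8
ways_all_heads = comb(8, 8)
ways_three_heads = comb(8, 3)

# 1,000 persons drawn from a population with the sexes in equal numbers.
total = 2 ** 1000
ways_even_split = comb(1000, 500)
chance_even_split = Fraction(ways_even_split, total)

print(arrangements_8, ways_all_heads, ways_three_heads)
print(len(str(total)) - 1)             # power of ten in 2^1000
print(len(str(ways_even_split)) - 1)   # power of ten in 1000!/(500! 500!)
print(float(1 / chance_even_split))    # "1 in ..." for an exact even division
```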

Hence the theorems relating to great numbers rest on a 
very much firmer basis than they would if divergence was 
due to that sort of coincidence which produces a so-called 
rare event. 

A common fallacy.

A "run of luck," good or bad, may be regarded as a succession
of improbable events, and is a more scientific expression
than a "rare event" as commonly understood. Of
a great number of events, deals of cards, investments,
bets, and so on, very many will give normal results,
average success at cards, normal returns to investments and
so on; very few will give abnormal winnings. The chance
of abnormal success in one venture being p, a small fraction,
the chance of a succession of n successes is $p^n$, very much
smaller when n is at all large. It is in the phrase "lucky
man" that the error is introduced. One who has benefited 
by the occurrence of a rare event may reasonably be called 
lucky, and the number of lucky men will be roughly proportional 
to the number of fortunate rare events ; but when a succession of 

events, say three, each of probability $\frac{1}{100}$, and conjointly of
probability $\frac{1}{1000000}$, or a broken succession (e.g., PPQPP, of
which the chance would be $\frac{1}{20000000}$) has taken place in one man's

favour, the imagination loses the logic of the case, and sup- 
poses an overruling law, and marks out that particular man as 
not subject to the law of probability: one is apt to expect 
that the next event will also be a success, and to be 
further confirmed in this opinion by paying attention to 

* Chance of 1,000 m. or 1,000 f. is twice this.

the one instance when the sixth event is a success, and
neglecting the ninety and nine when it is a failure. Other 
people are biassed in the opposite direction, and have dis- 
tinctly too great an expectation of a counterbalancing tendency, 
a long run of failures till the average is restored. It is thus 
correct to speak of a man having been lucky, but tempting 
Nemesis to speak of him as a lucky man. It is a mere 
truism to say that, unless a success or failure have some 
causal influence over future successes or failures (as when a 
good stroke at a game steadies the nerves for another), the 
probability of each future event is totally unaffected by what has 
gone before. 
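
The two chances quoted for the "run of luck" can be reproduced as follows. This is a sketch under an assumption not stated in the text: the figure of 1 in 20,000,000 for the broken succession is read here as the chance of four successes and one failure in five ventures, taken in any order; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 100)      # chance of abnormal success in one venture
q = 1 - p                 # chance of an ordinary result

# Three successes in succession.
three_running = p ** 3
# Four successes and one failure in five ventures, in any order (assumed reading).
four_in_five = comb(5, 4) * p ** 4 * q

print(three_running)
print(float(1 / four_in_five))
```

Because each venture is taken as independent, the chance of a further success after either sequence remains p.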

Statistical coefficients.

Let us return now to the method of deducing the chance p,
and the index n, used in the expansion $(p+q)^n$, from records
such as the death-rate. Notice first that the deduction
of p (the chance to be applied to each
individual to find the varying degrees of probability of the
possible totals) from the numbers, implies some hypothesis
as to the genesis of these numbers, the very theorem which
we wish it to illustrate; for suppose that in the records of
20 years we find 600,000 deaths in a stationary population of
1,000,000, we assume that this is the most probable number
which a chance regime would give, and since the most probable
number is the total × p, we deduce that
$p = \frac{600000}{1000000} \times \frac{1}{20} = \frac{3}{100}$; but

here we are making some undefined assumption about the 
occurrence of events similar to that defined by the curve of 
error. If we actually assumed the law of error, we can calcu- 
late how far the value of p so estimated may be expected to 
differ from its true value. This accounts in part for the diver- 
gence between the calculated grouping and the fact. 

Again, there is great difficulty in determining the number n,
the number of persons to whom the chance of the occurrence of
a particular event is applied, and we should further notice
that in many cases, in particular in concrete measurements,
such as height and normal length of life, we have no
information whatever as to n, which in this case is the number
of causes which may add or subtract undefined units from
height or age; and we are often equally in doubt when
dealing with great numbers, e.g., with the total value of
imports. In such cases we should have to deduce both p
and n from the records of results; and indeed it is simpler
to fall back on other methods of deducing the law of error than
the present one of regarding it as the limit of the binomial
expansion, determining the modulus without any assumption as
to the number of independent causes. Hence in a great many
instances we cannot expect to find close conformity to a
predetermined curve $(p+q)^n$. Similarly we can deduce from the
laws of gravitation and motion that a planet's orbit must be 
an ellipse, but cannot determine the eccentricity of this ellipse 
except by observation. 

Apparent non-correspondence of fact and theory.

A far-reaching cause of the apparent discrepancy between fact
and theory is, however, of a different kind. The theory applies
to experiments performed under unchanging conditions;
if we are drawing differently coloured
balls from a bag containing a great number, all
the external circumstances must be unchanged, and
the only variation that which comes from the so-to-say regulated 
randomness of the forces which decide shuffling and drawing. 
Now in human affairs, when we consider a series of death-rates 
or any other rates distributed in time, we are dealing with a 
constantly changing environment of social and sanitary habits, 
within which the apparently random forces that decide death 
are acting ; and these external changes may affect the inter- 
action of the random forces, just as a change in barometric 
pressure may affect the molecular forces of a rigid body. Such 
effects cannot be foretold or calculated ; we may expect that 
improvements in sanitation will diminish the death-rate, but 
some detail may increase it ; vaccination may diminish small- 
pox, but increase the liability to some other disease. To such 
reasons as these should be assigned the non-correspondence to 
the law of error of great numbers distributed in time. When 
the element of time is eliminated by a process of random 
averaging the correspondence is closer. Great numbers distri- 
buted in space are exempt from this disturbing cause and might 
be expected to show closer correspondence ; for instance the 
birth-rates in a number of districts might be expected to con- 
form more closely than rates for one place for a series of years ; 
but it is very difficult to obtain sufficiently homogeneous 
figures distributed in space ; though Prof. Lexis gives some 
instances of this kind.* 

* Massenerscheinungen, p. 66.

Physiological and anthropometrical measurements, such as
the heights of 10,000 children of the same age, are not affected 
by these difficulties, and should show close correspondence with 
the theoretical distribution ; and it is not surprising that the ratio 
of the number of male to the number of female births, depending 
as it does on hidden causes not easily influenced by the progress 
of civilisation, should show that remarkable consilience with the 
law of error, which has so often been remarked. Finally, the 
occurrence of sequences and groups of numbers, such as those 
obtained from logarithmic tables, being absolutely independent 
of changes in time or space, naturally shows complete agreement
with theory. 

The use of the law of error.

All these considerations make the application of the law of
error to actual measurements a very delicate operation, and it
may appear that the cases where agreement is
close are so few as to make the whole body of
theory useless; but this is an unscientific view to take. The
general process of applied science is to frame hypotheses as
nearly consistent with the facts as is possible without such
complications as will prevent their use, and then apply to the
idealized case the corrections which the actual cases necessitate.
This process has led to the best results in physical science. In 
the problems dealt with by the law of error, it will be found that 
many deductions from the idealized cases hold also when applied 
to the only partially corresponding records of great numbers ; 
just as, in mechanics, many theorems relating to smooth bodies 
can be applied unchanged to rough bodies. For instance, the
"fluctuation" of non-corresponding figures can be calculated by
the formula $\frac{2\Sigma\delta^2}{n}$; and the accuracy of an average of random
samples of quantities not grouped according to the curve of
error varies as the square root of the number of samples taken.*

From this discussion we may gather that we can seldom tell 
à priori whether the law of error will or will not apply to a given
series of figures. This must be determined by experiment for 
each new class of records ; but when we have found correspond- 
ence in many series of a class (as is the case in measurement of 
heights) we may proceed with confidence to apply the law to 
other similar series or groups. 

* See p. 308. † Ibid., p. 28.

Concrete measurements and ...

An important distinction is drawn by Prof. Lexis† and emphasised
by Prof. Edgeworth* between two classes of figures to
which the laws of great numbers apply. The
first, called by Lexis concrete, contains such quantities
as height measurements of a great number
of persons, and normal length of life, where a definite mean or
type seems to be normal and other measurements to be
variations from this type. In these cases it is not easy to connect
the facts which correspond with the exponential curve $y = e^{-x^2}$,
where x is the divergence from the type, with our deduction from
the limit (n infinite) of $(p+q)^n$. Suppose, however, that height
is determined by n forces, each capable of adding or of
subtracting 1 unit, say 1 millimetre, from normal height, and that the
chance that each shall act is p ; then the divergencies obtained 
in a number of individuals should be distributed according to 
the coefficients in this expansion. 

The other class, called by Lexis combinational, to which the
discussion in Section II. above more directly applies, contains those
totals which are the sum of a great number of items (persons,
deaths, births, &c.), for the existence of each of which a definite
chance, p, can be assigned à priori. The numbers may then be
expected (subject to the disturbing causes already discussed)
to be grouped in accordance with the curve
$y = \frac{1}{\sqrt{2\pi pqn}}\, e^{-x^2/2pqn}$,
where n is the total number of persons to whom the chance p
applies. In such cases p is the arithmetic average of $p_1, p_2 \ldots p_n$,
where $p_1 z, p_2 z, \ldots p_n z$ are the numbers of events which are
found respectively in n series each of z observations. p is the
"probability coefficient" of the event, and $p_1, p_2, \ldots p_n$ should
conform to a curve with modulus $\sqrt{\frac{2p(1-p)}{z}}$. On the other
hand, if c is calculated from the formula

$c^2 = \frac{2}{n-1}\{(p-p_1)^2 + (p-p_2)^2 + \ldots + (p-p_n)^2\} = \frac{2\Sigma\delta^2}{n-1}$,

we are treating the p's as "concrete" quantities, and obtain a
second value for the modulus.

If $\sqrt{\frac{2\Sigma\delta^2}{n-1}} - \sqrt{\frac{2p(1-p)}{z}} = 0$, the distribution of the
coefficients $p_1, p_2, \ldots p_n$ is normal, which is not often the case.

* Jubilee Volume of Statistical Journal, p. 191.

If this quantity <0, the coefficients are grouped more closely
together than the theory of error leads us to expect, and there
is some evidence that a force preventing divergence has been
called into play.

More generally this quantity is >0; the coefficients are then more
divergent than in accordance with the theory of error, and some
disturbing forces have acted.
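
As an illustration, the two values of the modulus may be compared for the Danish birth figures of Example I. The sketch rests on assumptions made only for the purpose of the example: every yearly series is treated as having the same size z, taken as the average number of births, and the names `c_theory` and `c_direct` are chosen here rather than taken from the text.

```python
import math

# Yearly proportion of boys among births in Denmark, 1860-1879 (Example I).
p_i = [.5166, .5117, .5150, .5162, .5168, .5138, .5187, .5120, .5126, .5102,
       .5160, .5149, .5144, .5137, .5157, .5144, .5145, .5098, .5148, .5145]
z = 57583                      # observations per series (assumed equal for all years)

n = len(p_i)
p = sum(p_i) / n

# Modulus expected if every birth carries the same chance p ("combinational").
c_theory = math.sqrt(2 * p * (1 - p) / z)
# Modulus found by treating the p_i as "concrete" quantities.
c_direct = math.sqrt(2 * sum((p - x) ** 2 for x in p_i) / (n - 1))

difference = c_direct - c_theory
print(round(c_theory, 5), round(c_direct, 5), round(difference, 5))
```

A difference near zero points to normal grouping; the sign shows on which side of the theoretical dispersion the records fall.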



Section IV. — The Permanence of Certain Small Numbers.

The binomial expansion and small numbers.

A remarkable side-light is thrown on our general argument
by the actual permanence of small numbers. Little attention
has been given to this phenomenon, but it is a
very striking fact that if among a great number
of items there are a few which present some
particular feature, it will be found that this small number is
seldom much exceeded and seldom entirely vanishes.

The following numerical example shows that this may
be expected theoretically, and an examination of the successive
terms of $(p+q)^n$ when p is very small and q nearly equal to 1
will show the same phenomenon more generally.

Constancy of Small Numbers.

$\left(\frac{1000}{1001} + \frac{1}{1001}\right)^{4000} = \frac{1}{1001^{4000}}\,(1000 + 1)^{4000}$.

1st term $= 1000^{4000} \div 1001^{4000}$.

$\log\left(\frac{1000}{1001}\right)^{4000} = 4000\,(3 - 3.0004341) = -1.7364 = \bar{2}.2636 = \log .018348$.

Chance of

No occurrences, 1st term = a, suppose                               = .0183
 1 occurrence,  2nd term $= 4000 \times 1000^{3999} \div 1001^{4000} = 4a$      = .0734
 2 occurrences, 3rd term $= \frac{4000 \times (4000-1)}{1 \cdot 2} \times 1000^{3998} \div 1001^{4000}$
       $= 8 \times 1000^{4000} \div 1001^{4000}$, correct to 1 in 4000, $= 8a$  = .1467
 3     "        4th term $= \frac{4000 \times 3999 \times 3998}{1 \cdot 2 \cdot 3} \times 1000^{3997} \div 1001^{4000}$
       $= \frac{4000 \times (4000^2 - 3 \times 4000)}{1 \cdot 2 \cdot 3} \times 1000^{3997} \div 1001^{4000}$
       $= \frac{4^3}{1 \cdot 2 \cdot 3} \times 1000^{4000} \div 1001^{4000}$, less 3 in 4000 approx. = .1956
 4     "        5th term $= \frac{4^4}{4!}\,a$, less 6 in 4000 approx.          = .1954
 5     "        6th term $= \frac{4^5}{5!}\,a$, less 10 in 4000   "             = .1562
 6     "        7th term $= \frac{4^6}{6!}\,a$, less 15 in 4000   "             = .1040
 7     "        8th term $= \frac{4^7}{7!}\,a$, less 22 in 4000   "             = .0593
 8     "        9th term $= \frac{4^8}{8!}\,a$, less 30 in 4000   "             = .0296
 9     "        10th term                                           = .0131
10     "        11th term                                           = .0052
11     "        12th term                                           = .0019
12     "        13th term                                           = .0006
13     "        14th term                                           = .0002

Terms 15 to 4001 together only occur about 1 in 10000.

Chance of 3, 4, 5, or 6 occurrences = .65 approx.
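
The terms of this expansion can also be evaluated without logarithm tables. The sketch below computes each term of $(q+p)^{4000}$ directly with Python's `math.comb`; the function name `chance` is illustrative, and the results should agree with the tabulated chances to about the third decimal place.

```python
from math import comb

n = 4000
p = 1 / 1001
q = 1000 / 1001

def chance(k):
    """Term of (q + p)^n giving exactly k occurrences."""
    return comb(n, k) * p ** k * q ** (n - k)

for k in range(14):
    print(k, round(chance(k), 4))

# Chance that the number of occurrences lies between 3 and 6 inclusive.
print(round(sum(chance(k) for k in (3, 4, 5, 6)), 3))
```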



To take an actual example: Out of some 530,000 deaths
annually from all causes the following are the numbers from 
splenic fever in the years 1875-1894: — 

5, 4, 10, 14, 12, 18, 9, 15, 8, 18, 11, 11, 11, 12, 7, 4, 3, 6, 7, 10.
Average 10.

Here $p = \frac{10}{530000}$, $q = \frac{52999}{53000}$.

n is doubtful, and may be taken to be the total number
of deaths or the total population; but it will be found that
the following numbers are unaffected, whichever number we
adopt.

The successive terms in the expansion $(p+q)^{530000}$ are given in the second column.

                                  Number of occurrences
                                  to be expected in        Number
Chance of                         20 years (chance × 20).  observed.

 0 deaths is     .000045              .0009                   0
 1     "         .00045               .009                    0
 2     "         .00225               .045                    0
 3     "         .0075                .15                     1
 4     "         .0185                .37                     2
 5     "         .037                 .74                     1
 6     "         .061                1.22                     1
 7     "         .087                1.74                     2
 8     "         .11                 2.2                      1
 9     "         .12                 2.4                      1
10     "         .12                 2.4                      2
11     "         .11                 2.2                      3
12     "         .09                 1.8                      2
13     "         .07                 1.4                      0
14     "         .05                 1.0                      1
15     "         .03                  .6                      1
16     "         .02                  .4                      0
17     "         .01                  .2                      0
18     "         .00                  .0                      2
More than 18     ...                 ...                      0

Considering the small number of years taken, and the in- 
definiteness of many of the death returns, the general consilience 
between the last two columns is satisfactory ; while the general 
principle that small numbers show a certain constancy is well 
exemplified. Specialists in all professions, from the doctor who 
treats only one obscure disease of the ear, to the dealer in 
curiosities, make their livelihood dependent on this permanence 
of small numbers. 
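
The comparison for splenic fever can be repeated in a few lines. This is a sketch under stated assumptions: n is taken as 530,000 (the yearly deaths from all causes), the terms are evaluated through `math.lgamma` to avoid very large factorials, and the expected number in 20 years is simply 20 times the chance; the function and variable names are illustrative.

```python
from math import exp, lgamma, log

# Deaths from splenic fever in the years 1875-1894, as listed above.
deaths = [5, 4, 10, 14, 12, 18, 9, 15, 8, 18,
          11, 11, 11, 12, 7, 4, 3, 6, 7, 10]

n = 530000                 # assumed here: yearly number of deaths from all causes
p = 1 / 53000
q = 1 - p

def chance(k):
    """Term of (q + p)^n for exactly k deaths, evaluated through logarithms."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(q))

years = len(deaths)
for k in range(19):
    expected = years * chance(k)
    observed = deaths.count(k)
    print(f"{k:2d}  chance {chance(k):.4f}  "
          f"expected {expected:4.1f}  observed {observed}")
```

Replacing n by a much larger number, with p reduced in proportion, leaves the printed chances practically unchanged, in keeping with the remark above that the choice of n does not affect the figures.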

The regular occurrence of accidents and of improbable events 
in general furnishes other examples of the same sort. 

Note.—Since writing this section my attention has been called to a
treatise by Dr Bortkewitsch, Das Gesetz der Kleinen Zahlen, Leipzig, 1898,
where the close agreement of the records of accidents and other occasional
events to the binomial expansion is dealt with in a more exhaustive and
analytical manner.

Section V. — Extension of the Law of Error and Applications.

Generalised statement of law of error.

We have only shown so far that great numbers fluctuate
about their mean in accordance with the law of error on the
assumption that for the existence or non-existence
of each particular unit there is the same numerical
chance p. We can, however, prove by elementary
methods that the same distribution is reached under many other
circumstances, and at the same time make several important
deductions.

Suppose that a quantity whose mean value is H is determined
by the action of a great number of causes; let the causes
produce deviations $\epsilon_1, \epsilon_2, \ldots$ which are connected with $\eta$, the
corresponding deviation of H, by the equation
$\eta = a_1\epsilon_1 + a_2\epsilon_2 + \ldots$,
where $a_1, a_2 \ldots$ are constants. If each of the deviations $\epsilon_1, \epsilon_2 \ldots$
can be of various magnitudes, the curves which show the
probabilities of the occurrence of these magnitudes are called
"curves of frequency," or "facility curves." If the curves of
frequency are normal curves of error, the chances of the occurrence
of the deviations $\epsilon_1, \epsilon_2 \ldots$ are proportional to
$e^{-\epsilon_1^2/c_1^2}$, $e^{-\epsilon_2^2/c_2^2} \ldots$,
where $c_1, c_2 \ldots$ are the moduli of these curves. The following
proof holds when these assumptions are justified; but the
resulting theorems hold (1) when the $\epsilon$'s belong to any curves of
frequency such that their limits are narrow, while the number of
$\epsilon$'s is great, and the limits of each of the $\epsilon$'s are small compared
with $\eta$, and none of the a's are predominant; (2) where $\eta$ is any
function of $(a_1\epsilon_1 + a_2\epsilon_2 + \ldots)$, such that
$\eta = a_1\epsilon_1 + a_2\epsilon_2 + \ldots$ is a first
approximation.*

The equation to the normal curve can also be deduced directly from other considerations, when we are dealing with any quantity liable to continuous small independent variations.†

We will now show that when the assumptions are limited as above, $\eta$ belongs to a normal curve of error with modulus $\sqrt{a_1^2c_1^2 + a_2^2c_2^2 + \dots}$.

* Adapted from a paper by Professor Edgeworth in the London and Edinburgh Philosophical Magazine, vol. 34, 1892, p. 429.

† See, for instance, Chauvenet's Astronomy, vol. ii., Appendix; and Merriman's Method of Least Squares, chap. ii.

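Case I. If H is determined by two causes only, $\eta = a_1\epsilon_1 + a_2\epsilon_2$.

Probability that the deviations $\epsilon_1$, $\epsilon_2$ concur

$$= C\, e^{-\epsilon_1^2/c_1^2} \times e^{-\epsilon_2^2/c_2^2} \quad \text{(where } C \text{ is a constant)}$$

$$= \text{(eliminating } \epsilon_2\text{)} \quad C\, e^{-\frac{\epsilon_1^2}{c_1^2} - \frac{(\eta - a_1\epsilon_1)^2}{a_2^2c_2^2}}.$$

This quantity is the probability that a deviation $\epsilon_1$ occurs with a deviation $\eta$. Giving $\epsilon_1$ all its possible values in turn, we have probability of a deviation

$$\eta = C\, e^{-\frac{\eta^2}{a_1^2c_1^2 + a_2^2c_2^2}} \sum_{\epsilon_1 = -\infty}^{\infty} e^{-\frac{a_1^2c_1^2 + a_2^2c_2^2}{a_2^2c_1^2c_2^2}\left(\epsilon_1 - \frac{a_1c_1^2\,\eta}{a_1^2c_1^2 + a_2^2c_2^2}\right)^2}.$$

Now the quantity included in the summation is the whole of a normal curve of error, which has been shifted through a horizontal distance $\frac{a_1c_1^2\,\eta}{a_1^2c_1^2 + a_2^2c_2^2}$, and its value depends only on the constants $a_1, a_2, c_1, c_2$ and is independent of $\eta$ and $\epsilon_1$.

Hence probability of a deviation $\eta = e^{-\frac{\eta^2}{a_1^2c_1^2 + a_2^2c_2^2}} \times$ constant, and $\eta$ belongs to a normal curve of error with modulus $\sqrt{a_1^2c_1^2 + a_2^2c_2^2}$.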

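Case II. If H is determined by three causes, $\eta = a_1\epsilon_1 + a_2\epsilon_2 + a_3\epsilon_3$, write $\eta_1$ for $a_1\epsilon_1 + a_2\epsilon_2$; then $\eta = \eta_1 + a_3\epsilon_3$.

By theorem of Case I., modulus for $\eta_1$ is $\sqrt{a_1^2c_1^2 + a_2^2c_2^2}$, and modulus for $\eta$ is $\sqrt{a_1^2c_1^2 + a_2^2c_2^2 + a_3^2c_3^2}$.

Similarly the theorem can be extended to any number of causes.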

Corollaries. 

1. Let $\bar{x}$ be the weighted average of the quantities $x_1, x_2, \dots$ with weights $a_1, a_2, \dots$, so that $\bar{x} = \frac{\Sigma ax}{\Sigma a}$; suppose that the weights are known accurately, but that $x_1 = x'_1 + \epsilon_1$, $x_2 = x'_2 + \epsilon_2, \dots$ where $x'_1, x'_2, \dots$ are correct values, and $\epsilon_1, \epsilon_2, \dots$ errors belonging to normal curves with moduli $c_1, c_2, \dots$ Then if $\bar{x}'$ be the correct value of $\bar{x}$,

$$\bar{x} = \frac{\Sigma a(x' + \epsilon)}{\Sigma a} = \bar{x}' + \frac{\Sigma a\epsilon}{\Sigma a}.$$

Hence the modulus of $\bar{x}$ is $\frac{\sqrt{a_1^2c_1^2 + a_2^2c_2^2 + \dots}}{a_1 + a_2 + \dots}$, for each term in the above theorem can be divided by the constant $a_1 + a_2 + \dots$*

2. Putting $a_1 = a_2 = a_3 = \dots$, and $c_1 = c_2 = c_3 = \dots = c$, we find as before that the modulus of an unweighted average of $n$ quantities, conforming to a curve with modulus $c$, is $\frac{\sqrt{n}\, c}{n} = \frac{c}{\sqrt{n}}$.

3. If H is the difference between two quantities whose mean values are the same, and moduli $c_1$ and $c_2$, then $\eta$, the deviation in H, $= \epsilon_1 - \epsilon_2$; modulus for $\eta = \sqrt{c_1^2 + c_2^2}$.

4. If the two quantities are the averages of $n_1$ and $n_2$ quantities with moduli $v_1, v_2$, then by corollary 2, $c_1 = \frac{v_1}{\sqrt{n_1}}$, $c_2 = \frac{v_2}{\sqrt{n_2}}$, and the modulus for the difference between the quantities is by corollary 3, $\sqrt{\frac{v_1^2}{n_1} + \frac{v_2^2}{n_2}}$.

5. In particular, the modulus for the difference between two quantities from one group, modulus $c$, is $c\sqrt{2}$.

Corollaries 3, 4, and 5 can be proved directly by the method 
of Case I. 
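
The theorem and its corollaries reduce to a few lines of arithmetic. The following is a minimal sketch in Python; the function names are illustrative assumptions and do not come from the text, and the errors are assumed independent and normal, as in Case I.

```python
import math

def modulus_of_sum(weights, moduli):
    """Modulus of a1*e1 + a2*e2 + ... for independent normal errors."""
    return math.sqrt(sum((a * c) ** 2 for a, c in zip(weights, moduli)))

def modulus_of_weighted_average(weights, moduli):
    """Corollary 1: divide the modulus of the sum by the total weight."""
    return modulus_of_sum(weights, moduli) / sum(weights)

def modulus_of_average(c, n):
    """Corollary 2: unweighted average of n quantities, each of modulus c."""
    return c / math.sqrt(n)

def modulus_of_difference(c1, c2):
    """Corollary 3: difference of two independent quantities."""
    return math.sqrt(c1 ** 2 + c2 ** 2)

# Thirty quantities of modulus 5 with equal weights: corollaries 1 and 2 agree.
print(modulus_of_weighted_average([1] * 30, [5] * 30))
print(modulus_of_average(5, 30))
# Corollary 5: two quantities from one group of modulus 5.
print(modulus_of_difference(5, 5))
```

Equal weights are used in the usage lines only to show that corollary 2 is a special case of corollary 1.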

Precision of an Average. 

The second corollary, that the precision of an average is proportional to the square root of the number of terms it contains, is so important that an independent proof may be offered, starting from different assumptions.

Suppose a great number ($m$) of observations to be made of a single unknown quantity, e.g., the declination of a star. Let $r$ be the "probable error" of a single observation, $h$ the precision of the group, $v$ the true, $v + d_1, v + d_2, \dots v + d_m$ the observed values of the quantity. Let $x_0$ be the arithmetic mean of the $m$ observed values, and let $x_0 = v + \delta$. Then $\delta$ is the error of the arithmetic mean.

Let $\delta_1, \delta_2, \dots \delta_m$ be the differences between the observed values and $x_0$, so that $v + d_1 = x_0 + \delta_1 = v + \delta + \delta_1$; $d_1 = \delta + \delta_1$, $d_2 = \delta + \delta_2$, etc.



* This should be compared with pp. 204-214, supra, 

Let $P_1$ be the probability that this set of observations concur. Then

$$P_1 = \frac{h}{\sqrt{\pi}}\, e^{-h^2 d_1^2} \times \frac{h}{\sqrt{\pi}}\, e^{-h^2 d_2^2} \times \dots \text{ to } m \text{ products} = \frac{h^m}{\pi^{m/2}}\, e^{-h^2\, \Sigma(\delta + \delta_1)^2}$$

$$= \frac{h^m}{\pi^{m/2}}\, e^{-h^2\{m\delta^2 + 2\delta\,\Sigma(\delta_1) + \Sigma(\delta_1^2)\}} = \frac{h^m}{\pi^{m/2}}\, e^{-h^2\, \Sigma(\delta_1^2)} \times e^{-h^2 m \delta^2}, \text{ since } \Sigma(\delta_1) = 0.$$

$P_1$ is the probability that the observed values yield an error $\delta$ in their arithmetic mean. Let $P_0$ be the probability that the observed values yield no error in their arithmetic mean, then

$$P_1 = P_0 \times e^{-h^2 m \delta^2}.$$

Hence the arithmetic mean belongs to a curve of error whose precision is $\sqrt{h^2 m} = h\sqrt{m}$, and therefore its probable error is $\frac{r}{\sqrt{m}}$.

If the errors $d_1, d_2, \dots$ occur $p_1, p_2, \dots$ times respectively in the observations, while $p_1 + p_2 + \dots = n$, the foregoing argument is unaffected, and the precision of the mean is $h\sqrt{n}$, that is $h\sqrt{p_1 + p_2 + \dots}$.

Care is needed to distinguish the hypotheses on which this formula, and the former formula $\frac{\sqrt{\Sigma a^2c^2}}{\Sigma a}$ connecting weights, depend.

A corresponding result may be obtained directly from the limit of the binomial expansion. If an experiment, for whose success the chance is $p$, is performed $n$ times, the most probable number of successes is the nearest integer to $pn$, and the modulus for the various numbers is $\sqrt{2pqn}$. The modulus for the average of the $n$ experiments is therefore $\frac{\sqrt{2pqn}}{n} = \frac{\sqrt{2pq}}{\sqrt{n}}$; that is, the precision is proportional to $\sqrt{n}$.

We can now obtain a formula for the modulus of a series of observations in a form often given. On p. 285 it is shown that if $\delta_1, \delta_2, \dots \delta_n$ are divergencies from their average of a series of observations, and if these divergencies conform to a law of error with modulus $c$, then $c$ should be taken as $\sqrt{\frac{2\Sigma\delta^2}{n}}$ and the centre of the curve at the average, for maximum probability. Now the average from which these divergencies are measured conforms to a curve modulus $\frac{c_1}{\sqrt{n}}$, where $c_1$ is the modulus of the divergencies measured from their true value, not from their arithmetic mean;

then, if $\Delta$ is the divergence from the true value, modulus $c_1$,
$\delta$ is the divergence from the arithmetical mean, modulus $c$,
$d$ is the divergence of the arithmetic mean from the true value, modulus $\frac{c_1}{\sqrt{n}}$,

$$\Delta = \delta + d$$

and $c_1^2 = c^2 + \frac{c_1^2}{n}$, from page 304.

$$\text{Hence } c_1^2 = \frac{n}{n-1}\, c^2 = \frac{2\Sigma\delta^2}{n-1}.$$

Since $n$ is large these quantities are very nearly equal; and it is not worth while here to discuss their relative merits; the latter quantity $\sqrt{\frac{2\Sigma\delta^2}{n-1}}$ is generally used.*

As an example of this greater precision of averages, take the averages given on p. 289, each of 30 numbers, which range on a normal curve modulus 5; these averages are 25.9, 25.7, 24.4, 24.4, 24.5, 24.9, 24.8, 25.7, 24.5, 25.7. General average 25.05; differences, .85, .65, -.65, -.65, -.55, -.15, -.25, .65, -.55, .65; $\sqrt{\frac{2\Sigma\delta^2}{n-1}} = .89$. The modulus for such groups is by theory $\frac{5}{\sqrt{30}} = .91 \dots$
From the same page we may find the following 30 averages, each of 10 numbers: 23.6, 24.8, 24.4, 24.6, 25.3, 23.6, 25.7, 25.8, 26.6, 26.0, 24.5, 25.0, 24.7, 26.9, 25.5, 23.8, 24.3, 24.3, 24.0, 23.8, 24.4, 26.6, 24.4, 25.5, 25.6, 26.8, 24.0, 26.2, 25.6, 25.0.

The probable error for these is by theory .47 of $\frac{5}{\sqrt{10}}$ = .642; and between the limits 25.04 ± .64 we actually find 15 out of the 30 averages, while 6 are below the lower, and 7 above the higher limit.

The modulus for the whole 300 is $\frac{5}{\sqrt{300}} = .29$ and the probable error .14; the average for an infinite number would be 25; for the 300 selected it is 25.043, that is well within the probable error.

* Vide article by Prof. Edgeworth, Camb. Phil. Soc. Trans., 1885.
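
The first of these comparisons can be checked directly. The sketch below, in Python, recomputes the modulus of the ten averages of 30 with the divisor $n - 1$ and sets it beside the theoretical value; the variable and function names are illustrative assumptions, and the probable-error factor is taken as .47, as in the text.

```python
import math

# The ten averages of 30 numbers quoted in the text.
averages = [25.9, 25.7, 24.4, 24.4, 24.5, 24.9, 24.8, 25.7, 24.5, 25.7]

def modulus(values):
    """Modulus sqrt(2 * sum(d^2) / (n - 1)) of deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    squares = sum((v - mean) ** 2 for v in values)
    return math.sqrt(2 * squares / (n - 1))

observed = modulus(averages)
theoretical = 5 / math.sqrt(30)           # modulus 5, groups of 30
whole_300 = 5 / math.sqrt(300)            # modulus of the average of all 300
probable_error_300 = 0.47 * whole_300     # probable error = .47 of the modulus

print(round(observed, 2), round(theoretical, 2))
print(round(whole_300, 2), round(probable_error_300, 2))
```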

Examples of this kind could be multiplied indefinitely. 

Samples. 

The bearing of this principle on the method of sampling is very important. Our experience on most subjects is derived, not by examining all the existing examples, but by noting a few which come in our way. A man of specialized experience is one who has seen and analyzed mentally many cognate phenomena. It needs no proof that the more samples taken, the more accurate will be the judgment formed about the group of which they are samples. Very many business transactions are decided by such an examination. Now we have seen that the precision of the average shown by samples of quantities which satisfy the normal law of error is proportional to the square root of their number; but there are three further questions to consider: (α) whether this rule applies to samples of quantities which do not conform to the law of error, that is, which would not be obtained from a normal distribution without great improbability; (β) how we are to measure the precision of either the original group of which we have samples or of our samples; (γ) whether we can learn anything more about the original group besides its average.

α. On referring back to page 303, it will be seen that the
averages of samples of, say, m quantities drawn at random from 
a large group whose distribution is not normal, will, if m is 
large in relation to the fluctuation of the original group, satisfy 
the law of error. The reason, apart from the mathematical 
analysis of this, is clearer from the following illustration : if 
we have records of a quantity, which fluctuates in accordance 
with the normal law about an average which changes slowly 
year by year, our measurements will not conform to the normal 
law ; but if we select four years at random again and again, 
we shall eliminate the influence of time, and our samples will 
tend to conform. Readers may experiment on the annual birth- 
rates to illustrate this. 
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The following numbers are the death-rates per 10,000 in 
London registration districts, arranged in order of magnitude:

 70   70   70   92
100  107  108  108  109
113  115  115  115  116  117  118  118
120  121  121  123  123  124  125  126  126  127  128
130  130  131  132  132  132  133  136  138  139  139
141  141  141  142  144  144  144  144  145  145  147  148  149
150  150  150  151  151  152  152  153  154  155  156  158  158
160  163  164  166  167  167  168
170  177  178
181  183  183  185
191  194  198  198
204  205  210  211  220  222  222  223  228
230  236  237  238
252  252  255  264  264  266  276  284  286
323  329  329  404  448  475  505  622  625  1,408


These numbers clearly do not conform to the normal curve. We will omit 1,408 as being so far from the others as to be in a class by itself and select at random samples of 4, 18 times. Their averages are 174, 222, 226½, 221, 129, 150, 181½, 193, 300, 133, 216, 178, 167, 169½, 183, 150, 227, 164. Average, 188; modulus, 57.4. These fit a curve of error closely, thus:

                                        Calculated from
                          Observed.     Table on p. 281.
Within   5   of average       2              1.7
  "      6½     "             3              2.3
  "     10      "             4              3.5
  "     14      "             5              5.5
  "     18½     "             6              6.3
  "     21      "             7          [illegible]
  "     24      "             8              8.0
  "     28      "             9              9.2
  "     33      "            10             10.5
  "     34      "            11             10.7
  "     38      "            12             11.3
  "     38½     "            14             11.8
  "     39      "            15             11.9
  "     55      "            16             14.8
  "     59      "            17             16.7
  "    112      "            18             18.0


Thus the theorem is confirmed in a very unpromising case. 
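
The comparison in this table can be retraced numerically. The following Python sketch recomputes the average and modulus of the eighteen sample averages and, for any limit, counts the averages falling within it and the number a normal curve of that modulus would lead us to expect. It assumes the divisor $n$ for the modulus and takes `math.erf` as a stand-in for the table on p. 281; the names are illustrative and not from the text.

```python
import math

# The eighteen averages of random samples of 4 quoted in the text.
sample_averages = [174, 222, 226.5, 221, 129, 150, 181.5, 193, 300,
                   133, 216, 178, 167, 169.5, 183, 150, 227, 164]

n = len(sample_averages)
mean = sum(sample_averages) / n
# Modulus taken as sqrt(2 * sum(d^2) / n); this divisor is an assumption.
c = math.sqrt(2 * sum((v - mean) ** 2 for v in sample_averages) / n)

def observed_within(limit, centre=188):
    """Number of sample averages lying within `limit` of the centre."""
    return sum(1 for v in sample_averages if abs(v - centre) <= limit)

def expected_within(limit):
    """Number expected within `limit` on a normal curve of modulus c."""
    return n * math.erf(limit / c)

print(round(mean), round(c, 1))
for limit in (5, 14, 28, 39, 59, 112):
    print(limit, observed_within(limit), round(expected_within(limit), 1))
```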

β. To determine the precision of the average of our samples, two methods are open. The first consists in finding the modulus $c = \sqrt{\frac{2\Sigma\delta^2}{n-1}}$ of the $n$ quantities chosen; then if the quantities conform to a normal curve the modulus of their average is $\frac{c}{\sqrt{n}} = \sqrt{\frac{2\Sigma\delta^2}{n(n-1)}}$, and the precision is $\frac{\sqrt{n}}{c}$; if the quantities do not conform this formula still gives the best measure of the precision, but it may be well to confirm it by the second method. This method is to break up the $n$ samples into $\frac{n}{m}$ smaller groups each of $m$, and see if the averages of these groups are such as would come from a normal distribution; if they do not, increase $m$; if they show signs of normal grouping in a curve of modulus $c_m$, before we have come to the limiting value of $m$, then we may expect that the larger sample of $n$ things belongs to a normal curve, whose modulus is $\frac{c_m\sqrt{m}}{\sqrt{n}}$, which may be expected to be equal to $\frac{c}{\sqrt{n}}$.

If we do not get conformity with the largest value of $m$ we can take, we have no guarantee that $n$ is large enough to eliminate the abnormality of the original figures.

The following statistics of wages give a practical application of this principle.

In the period 1834-45 inquiries were made in the Scotch villages as to the day wages of agricultural labourers.

The resulting figures for the Lowlands may be tabulated as follows:

Wage      13d.  13½d.  14d.  15d.  16d.  16½d.  17d.  17½d.  18d.  18½d.
Numbers     5     3      2     8    12     6     24    [?]    39     3

Wage      19d.  20d.   21d.  22d.  23d.  23½d.  24d.  24½d.  25d.  27d.
Numbers    27    26     27    15     1     1      4     1      2     2

Average, 18.8d.; modulus, 3.62d. = $c$.
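
The average and modulus of such a frequency table may be recomputed as in the Python sketch below. One entry is an assumption: the number of returns at 17½d. is not legible, and the value 3 is taken here only because it brings the total to the 211 returns mentioned later in the text. The divisor $n$ is also assumed, so the result can be expected to agree with the printed modulus only approximately.

```python
import math

# Wage in pence -> number of returns.
# ASSUMPTION: the count at 17.5d is taken as 3 so that the total is 211.
wage_counts = {13: 5, 13.5: 3, 14: 2, 15: 8, 16: 12, 16.5: 6, 17: 24,
               17.5: 3, 18: 39, 18.5: 3, 19: 27, 20: 26, 21: 27, 22: 15,
               23: 1, 23.5: 1, 24: 4, 24.5: 1, 25: 2, 27: 2}

total = sum(wage_counts.values())
mean = sum(w * k for w, k in wage_counts.items()) / total
# Modulus = sqrt(2 * sum of squared deviations / n), weighting by frequency.
c = math.sqrt(2 * sum(k * (w - mean) ** 2 for w, k in wage_counts.items()) / total)

print(total, round(mean, 1), round(c, 2))
```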



Correspondence with Law of Error.

                              Observed
                           Above       Below
Limits.        Normal.    Average.    Average.    Total.
18.8 ± ⅕c        46          27    +      3    =    30
  "    ⅖c        90          53    +     45    =    98
  "    ⅗c       127          53    +     69    =   122
  "    ⅘c       156          80    +     87    =   167
  "     c       178          95    +     87    =   182
  "   1⅕c       192          96    +     95    =   191
  "   1⅖c       201          97    +     97    =   194
  "   1⅗c       206         102    +    100    =   202
  "    2c       210         104    +    105    =   209
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When we divide the returns into 50 samples of 4 we get 
modulus for their averages 1.8; 25 samples of 8 give modulus 1.14; 40 samples of 5 give modulus 1.57; 20 samples of 10 give modulus 1.19.

The $c$ for the original samples may be found from any of these; the results are:

Modulus of original samples                                   3.62
   "    calculated from the groups of  4,  $1.8 \times \sqrt{4}$    = 3.6
   "        "       "    "     "       8,  $1.14 \times \sqrt{8}$   = 3.2
   "        "       "    "     "       5,  $1.57 \times \sqrt{5}$   = 3.5
   "        "       "    "     "      10,  $1.19 \times \sqrt{10}$  = 3.8

This is a close consilience with theory. We will adopt 3.6 as the value of $c$; then the modulus of the average of the 211 original samples is $\frac{3.6}{\sqrt{211}}$, its precision $\frac{\sqrt{211}}{3.6}$, and its probable error .47 of $\frac{3.6}{\sqrt{211}} = .12 \dots$, or ⅛ of a penny.
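
The arithmetic of this check is short enough to set out in full. The Python sketch below multiplies each group modulus by the square root of the group size to recover the modulus of the original returns, and then finds the probable error of the general average; the names are illustrative assumptions, and the factor .47 is taken from the text.

```python
import math

# Size of group -> modulus observed for the averages of such groups.
group_moduli = {4: 1.8, 8: 1.14, 5: 1.57, 10: 1.19}

# Modulus of the original returns implied by each grouping: c = c_m * sqrt(m).
implied_c = {m: c_m * math.sqrt(m) for m, c_m in group_moduli.items()}

c = 3.6      # value adopted in the text
n = 211      # number of original returns
modulus_of_average = c / math.sqrt(n)
probable_error = 0.47 * modulus_of_average

for m, value in implied_c.items():
    print(m, round(value, 1))
print(round(probable_error, 2))
```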

We should verify that the samples conform to the law of error: the following shows the comparison for the samples of 4:

                                             Observed
                                          Above       Below
Limits.                       Normal.    Average.    Average.    Total.
18.8 ± ⅕ of modulus (1.8)       11           6    +      7    =    13
  "    ⅖      "                 21           7    +     11    =    18
  "    ⅗      "                 30          14    +     17    =    31
  "    ⅘      "                 37          14    +     23    =    37
  "    modulus                  42          18    +     24    =    42
  "   1⅕ of modulus             45          20    +     26    =    46
  "   1⅖      "                 48          23    +     26    =    49
  "   1⅗      "                 49          23    +     26    =    49
  "   1⅘      "                 49          23    +     27    =    50
  "    2 modulus                50          23    +     27    =    50


This resemblance is as close as the argument requires. 

γ. If our first samples conform to the law of error we know with reasonable certainty the average and the distribution of the original quantities, namely, that they conform to a normal curve with approximately the same average and modulus as our samples. The general average and the sample average differ in accordance with a law of error, modulus $\frac{c}{\sqrt{n}}$, where $c$ is the modulus for samples and $n$ their number.

If our first samples do not conform, it is still probable that their curve of frequency has a resemblance to that of the original quantities. If the fraction $p_1$ of the original quantities lay between assigned limits $a_1$ and $a_2$, then the number to be expected between those limits in $n$ samples is decided by the expansion of $(p_1 + q_1)^n$, where $p_1 + q_1 = 1$; the most probable number is the integer nearest $p_1 n$, and the modulus is $\sqrt{2p_1q_1n}$; similarly if $p_2$, $q_2$ bear a similar relation to $a_2$, $a_3$ the most probable number selected between these limits is $p_2 n$, and modulus $\sqrt{2p_2q_2n}$, and so on. Thus a similar distribution may be expected, and each part of it has a precision varying jointly as the square root of the whole number taken and the quantity $\sqrt{p(1-p)}$; thus the larger the number taken the greater will be the resemblance, and [since $\sqrt{p_1(1-p_1)} > \sqrt{p_2(1-p_2)}$ when $p_1 > p_2$ and $p_1 + p_2 < 1$] the larger the altitude of the area in the curve of frequency corresponding to given limits the greater its precision. The errors of the various divisions are not, however, entirely independent of one another. This is, of course, in strict accordance with the common sense of the question.

The following examples of school ages illustrate part of 
this argument. In a school containing 257 boys of varying 
ages, where the dispersion was not likely to be 
normal, 48 were selected at random and their 
ages written down. 

The modulus of the 48 samples is 43.2 ; their average 13 
years 10 months ; their distribution as follows : — 



[Table of the numbers found within successive fractions of the modulus about the average; only the total of 48 is legible.]

[...] are not grouped symmetrically nor in close [...] the normal distribution.

[...] take the average of random samples we do not [...] to the normal curve till the number of [...] is too small to work with. Hence we have no
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choice but to assume that 48 is a large enough number of items to 
neutralize the want of symmetry in the figures. The average of the whole group is as likely as not to be within the limits 13 years 10 months $\pm \frac{43.2}{\sqrt{48}} \times .47$ months, that is between 13 years 7 months and 14 years 1 month.

Again the quartiles in our samples are at 18 months above and 2 years below the average; the quartiles in the original group may be expected to be within the same distances with probable errors $\sqrt{2 \times \tfrac{1}{4} \times \tfrac{3}{4} \text{ of } 48} = 4.2$ months, since the chance that any quantity shall be between the average and the lower quartile is ¼.

From a census of the whole school it was found that all 
these conditions were fulfilled; the average was 14 years; the 
quartiles were unfortunately not kept ; but 58 boys out of the 
257 were stated to be over 15 years 9 months, from which it 
is highly probable that the upper quartile was within the given 
limits, 15 years 6 months ± 4 months; and 54 were below 11 
years 10 months, which places the lower quartile also well 
within the limits. 
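
The two figures used in this example can be recovered as follows. This Python sketch assumes that the modulus 43.2 is expressed in months and that the average of 13 years 10 months is taken as 166 months; the helper `years_months` is an illustrative name, not part of the text.

```python
import math

c = 43.2          # modulus of the 48 sampled ages (assumed to be in months)
n = 48
average = 13 * 12 + 10   # 13 years 10 months, in months

# Probable error of the average: .47 of c / sqrt(n), rounded to whole months.
probable_error = 0.47 * c / math.sqrt(n)
margin = round(probable_error)

def years_months(months):
    """Split a number of months into (years, months)."""
    return divmod(months, 12)

lower = years_months(average - margin)
upper = years_months(average + margin)

# Figure quoted for the quartiles: sqrt(2 * 1/4 * 3/4 * 48).
quartile_figure = math.sqrt(2 * 0.25 * 0.75 * n)

print(lower, upper, round(quartile_figure, 1))
```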

The principle of corollary 4, the modulus of a difference, is most useful in comparing two groups selected as having certain qualities. Thus Professor Edgeworth* discusses whether an ascertained difference of 2 inches between the average heights of a large number of criminals and that of the general population is significant; and finding that the modulus for the difference between two random groups is only 0.08, holds that there is a cause of the difference in the method of selection; that is, that criminality and low stature are found together. We might apply the same principle to the investigation of the existence of a period in any figures; for if the modulus of the figures was $c$, the modulus for the difference between the averages of two random samples of 20 months each would be $c\sqrt{\frac{1}{20} + \frac{1}{20}} = \frac{c}{\sqrt{10}}$; if the difference between the averages of the figures for 20 Decembers and 20 Junes was 3 times this quantity the existence of a period would be established. For instance, in the percentage of ironfounders unemployed monthly from 1855 to 1874† the modulus for single months is about 30, and for the difference between the averages of two groups one of 20 and the other of 240 is therefore $30\sqrt{\frac{1}{20} + \frac{1}{240}} = 7$; but the average of the 20 Decembers is about 29 % above the general average, a significant difference; and the average of the 20 Augusts is about 19 % below, a divergence smaller than before, but still significant; the difference between the Decembers and Augusts, namely, 48, is to be compared with the modulus $30 \times \sqrt{\frac{1}{20} + \frac{1}{20}} = 9$, and is therefore significant.

* Statistical Journal, Jubilee Number, ibid.

† See p. 179, supra.
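
The moduli used in the ironfounders example follow from corollary 4 and may be sketched in Python as below. The function name is an illustrative assumption; the inputs are the modulus of about 30 for single months and the group sizes given in the text.

```python
import math

def modulus_of_difference_of_averages(c, n1, n2):
    """Modulus for the difference between averages of n1 and n2 items,
    each item belonging to a curve of modulus c (corollary 4)."""
    return c * math.sqrt(1 / n1 + 1 / n2)

# 20 Decembers against all 240 months.
against_general = modulus_of_difference_of_averages(30, 20, 240)
# 20 Decembers against 20 Augusts.
between_months = modulus_of_difference_of_averages(30, 20, 20)

print(round(against_general, 1), round(between_months, 1))
# Ratio of the observed December excess (29) to its modulus.
print(round(29 / against_general, 1))
# Ratio of the December-August difference (48) to its modulus.
print(round(48 / between_months, 1))
```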

A final example may be given which brings into relation many of these theorems. The following were the recorded times for "The Oaks" from 1850 to 1899; we will discuss whether there has been a significant increase of speed, or some change in the conditions of the race, or whether the fluctuations are due to minor causes varying year by year.

       min. sec.          min. sec.          min. sec.          min. sec.          min. sec.
1850    2   56     1860    2   56     1870    2   52     1880    2   49     1890    2   40[?]
1851    2   52     1861    2   44     1871    2   51     1881    2   46     1891    2   54[?]
1852    3    0     1862    2   49     1872    2   52     1882    2   49     1892    2   43[?]
1853    2   52     1863    2   54     1873    2   50½    1883    2   53     1893    2   44[?]
1854    3    0     1864    2   47     1874    2   48[?]  1884    2   49     1894    2   50
1855    2   58     1865    2   51     1875    2   49½    1885    2   43[?]  1895    2   48[?]
1856    3    4     1866    2   53     1876    2   50     1886    2   54[?]  1896    2   45[?]
1857    2   50     1867    2   54     1877    2   54½    1887    2   50[?]  1897    2   45
1858    2   53½    1868    2   47½    1878    2   54     1888    2   42[?]  1898    2   45[?]
1859    2   55     1869    2   59     1879    3    2     1889    2   45     1899    2   44

Ten yearly
averages  2  56            2   51½            2   52½            2   48             2   47

([?] marks a fraction of a second that is illegible.)

These figures fit fairly closely a normal curve of error with modulus 7.43 secs., average 2 min. 50.87 secs. The modulus for the difference between two is therefore $7.43\sqrt{1 + 1} = 10.48$ secs. The greatest difference between consecutive years is 14 secs., between 1856 and 1857; this is not sufficiently far beyond the modulus to make it uncommon; hence there is no proof of any sudden change in arrangements having taken place between two races. The difference between the times for years early in the period and those later is sometimes as much as 20 secs. The modulus for the difference between the averages for two periods of 10 years is $7.43\sqrt{\frac{1}{10} + \frac{1}{10}} = 3.3$. The difference between the averages for 1850-59 and 1890-99 is 9 secs., which is significant; that between 1850-59 and 1880-89 is also significant. The odds against such a difference as that between the average times of 1850-59 and 1860-69 are only 13 to 1, not very significant. Hence we find that some cause was at work which gradually quickened the race between the fifties and the eighties.
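
The judgments of significance above rest on the chance that a deviation exceeds a given multiple of its modulus. For a normal curve of modulus $c$ that chance, in either direction, is the complementary error function of the ratio, which the Python sketch below evaluates with `math.erfc`. The function name is an illustrative assumption, and the differences of 8 and 9 seconds are read from the printed ten-yearly averages.

```python
import math

c = 7.43  # modulus of single yearly times, in seconds

modulus_two_years = c * math.sqrt(1 + 1)
modulus_two_decades = c * math.sqrt(1 / 10 + 1 / 10)

def chance_of_exceeding(difference, modulus):
    """Chance that a normal deviation of the given modulus is numerically
    greater than `difference` (both directions taken together)."""
    return math.erfc(abs(difference) / modulus)

# Greatest difference between consecutive years: 14 seconds.
consecutive = chance_of_exceeding(14, modulus_two_years)
# 1850-59 against 1880-89: averages 2 min 56 and 2 min 48, difference 8 seconds.
fifties_vs_eighties = chance_of_exceeding(8, modulus_two_decades)
# 1850-59 against 1890-99: difference 9 seconds.
fifties_vs_nineties = chance_of_exceeding(9, modulus_two_decades)

print(round(modulus_two_decades, 1))
print(consecutive, fifties_vs_eighties, fifties_vs_nineties)
```

A chance of several per cent, as for the 14 seconds between consecutive years, is not uncommon; a chance below one in a thousand, as for the decade averages, is what the text treats as significant.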

This method can be applied to the criticism of such serial figures as birth, death, and marriage rates, imports, exports, and so on. With a periodic series the method can be used first for establishing the period, and then for investigation of the figures found when the periodicity is eliminated. With a symptomatic* curve, the method can be used for measuring the symptomatic tendency, and then for studying the short-period fluctuations. For a series which has no symptom and no period, the method is at once applicable for finding what divergencies are significant, and for forecasting and interpolating numbers. Without some machinery of calculation of this kind we are unable to get beyond vague and general impressions of the existence of a change;† but if we take care that the conditions of the calculation are satisfied, we can by the method now developed make a definite statement quite independent of personal bias, such as "either an event has happened, so improbable as to be outside the range of human experience, or the decrease shown in the series of figures in question is due to some significant change in the system of causes which produce them."

* See p. 240, supra. 

† We can take an intermediate step by noticing in the above table that in nine cases out of ten the times in the decade 1880-9 are less than the times thirty years earlier; the chance that so great an agreement in the direction of the change (irrespective of its magnitude) should come in a random selection is $\frac{11}{512}$ or .0215; the chance as calculated above is .0006. See Edgeworth in Jubilee Volume of Statistical Journal, pp. 213-217.
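
The chance of .0215 quoted in this note can be obtained by counting. The Python sketch below assumes that each of the ten comparisons is equally likely to fall either way, and that agreement in either direction is counted; the function name is illustrative only.

```python
from math import comb

def chance_of_agreement(n, k):
    """Chance that at least k of n changes agree in direction, in either
    direction, when each change is equally likely to be up or down.
    Intended for k greater than n / 2."""
    one_direction = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n
    return 2 * one_direction

# Nine or more of the ten yearly comparisons agreeing in direction.
print(round(chance_of_agreement(10, 9), 4))
```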
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Section VI. — The Theory of Correlation. 

It is never easy to establish the existence of a causal connection between two phenomena or series of phenomena; but a great deal of light can often be thrown by the application of algebraic probability. We have already dealt with some cases in point; we have shown how to find whether an event is due to a special cause, or whether it naturally arises from the variation of existing causes; we have shown how to measure the significance of the difference between two quantities or two averages; and further, we have investigated such problems as the influence of the seasons.* In many large groups of phenomena we can apply a more refined and more certain method, which it is our object to introduce in this section. When two quantities are so related that the fluctuations in one are in sympathy with the fluctuations of the other, so that an increase or decrease of one is found in connection with an increase or decrease (or inversely) of the other, and the greater the magnitude of the changes in the one, the greater the magnitude of the changes in the other, the quantities are said to be correlated. Correlation is a quantity which can be measured numerically; and its measurement has been the subject of much recent mathematical investigation.

Let two variable quantities X, Y be subject to variations $x$, $y$, which are due to a multitude of individually unimportant causes, producing fluctuations $\epsilon_1, \epsilon_2, \dots$; $\epsilon'_1, \epsilon'_2, \dots$ so that the $x$'s are connected with the $\epsilon$'s and the $y$'s with the $\epsilon'$'s by the equations

$$x = a_1\epsilon_1 + a_2\epsilon_2 + \dots + a_n\epsilon_n, \qquad y = b_1\epsilon'_1 + b_2\epsilon'_2 + \dots + b_n\epsilon'_n,$$

where $a_1, a_2, \dots b_1, b_2, \dots$ are constants.

Then $x$ and $y$ conform to normal curves of error, whose moduli we will call $c_1$, $c_2$.

The rest of our investigation, which is based closely on Professor Karl Pearson's paper on "Regression, Heredity, and Panmixia,"† proceeds on the assumption that the $\epsilon$'s and $\epsilon'$'s conform to normal curves of error, which is not, however, the most general condition.

* See p. 186, supra.

† Transactions of the Royal Society, vol. 187 (1896), A. 253-318.

Let individual values of X and Y, $x$ and $y$ respectively, be grouped in pairs, as measurements of two quantities at the same date, or of two parts of the same organism, or in any other way. If $x$ and $y$ are quite independent, none of the causes producing them are common to both, and the $\epsilon$'s are independent of the $\epsilon'$'s in the above equations. Then $z$, the chance of divergencies $x$ and $y$ concurring,

$$= e^{-x^2/c_1^2} \times e^{-y^2/c_2^2} \times \text{(a constant)}.$$

For any one value of $x$, the quantities $y$ are grouped about the mean value Y, in accordance with the normal curve $c_2$; and similarly for any one value of $y$.

The above equation may be written $z = C \times e^{-\left(\frac{x^2}{c_1^2} + \frac{y^2}{c_2^2}\right)}$. If we give $z$ any definite value $k$, the $x$'s and $y$'s which have jointly the probability $k$ are connected by the equation

$$\frac{x^2}{c_1^2} + \frac{y^2}{c_2^2} = \log_e \frac{C}{k},$$

which is the equation of an ellipse having its principal axes coincident with the axes of measurement of $x$ and $y$, if we suppose $x$ and $y$ measured on two horizontal lines perpendicular to one another. Let $z$ be measured vertically; then in the surface given by the equation connecting $z$, $x$, $y$ all the horizontal sections are similar ellipses, whose projections on a horizontal plane are concentric and similarly situated,* while all the vertical sections are normal curves of error.†

This is the surface of no correlation.

If, on the other hand, any of the $\epsilon$'s coincide with any of the $\epsilon'$'s, it may be shown that a new term is introduced in the equation between $z$, $x$, and $y$, which becomes

$$z = \frac{n}{\pi c_1 c_2 \sqrt{1 - r^2}}\; e^{-\frac{1}{1 - r^2}\left(\frac{x^2}{c_1^2} - 2r\frac{xy}{c_1c_2} + \frac{y^2}{c_2^2}\right)},$$

where $n$ is the number of pairs of observations and $r$ is a quantity we have still to decide; this is the general equation of the normal correlation surface. The horizontal sections, obtained by giving $z$ different constant values, are now of the form

$$\frac{x^2}{c_1^2} - 2r\frac{xy}{c_1c_2} + \frac{y^2}{c_2^2} = l, \text{ a quantity independent of } x \text{ and } y.$$

* For a diagram of this projection and for a general discussion of correlation on the same lines as this chapter, but more advanced and complete, see Mr Udny Yule's paper on The Theory of Correlation in the Statistical Journal, Dec. 1897.

† For the section by a plane $y = mx + n$ is $z = C\, e^{-\left(\frac{x^2}{c_1^2} + \frac{(mx + n)^2}{c_2^2}\right)}$, which may be written $z = e^{-A(x - B)^2} \times D$ where A, B, and D are constants.

The projections of the horizontal sections are still concentric, similar and similarly situated ellipses, but their principal axes are now inclined to the axes of $x$ and $y$. The vertical sections are still normal curves of error with various centres; in particular the frequencies of the values of $y$ found in conjunction with $x_1$, a particular value of $x$, are given by the equation

$$z = e^{-\frac{1}{1 - r^2}\left(\frac{x_1^2}{c_1^2} - 2r\frac{x_1 y}{c_1c_2} + \frac{y^2}{c_2^2}\right)} \times \text{constant} = e^{-\frac{\left(y - r\frac{c_2}{c_1}x_1\right)^2}{c_2^2(1 - r^2)}} \times \text{constant}.$$

This is a normal curve of error with its centre at $r\frac{c_2}{c_1}x_1$.

Thus the mean value of $y$ corresponding to $x_1$, a given value of $x$, is $r\frac{c_2}{c_1}x_1$.

These mean values all lie on the line $\frac{y}{c_2} = r\,\frac{x}{c_1}$.

Similarly the mean values of $x$, corresponding to given values of $y$, lie on the line $\frac{x}{c_1} = r\,\frac{y}{c_2}$.

$r$ is called the coefficient of correlation. If $r$ is positive, for every given value of $x$, the mean value of the corresponding $y$'s is positive and a definite fraction of $x$; if $r$ is negative, the correlation is said to be negative, and for every given value of $x$, the mean value of the corresponding $y$'s is a definite negative fraction of $x$.

To determine the value of $r$, we must observe that this single quantity determines the shape of the whole surface, when $n$, $c_1$, $c_2$ are given, just as the modulus determines the shape of the curve of error. We decided the best value of the modulus* by considering from what curve of error the observed values would arise with least improbability. Professor Pearson finds the value of $r$ by considering from what distribution of $z$, $x$, and $y$ (i.e., from what surface of correlation) the observed pairs would arise with least improbability; $r$ is thus found to be $\frac{\Sigma xy}{n\sigma_1\sigma_2}$ or $\frac{1}{n}\Sigma\left(\frac{x}{\sigma_1} \cdot \frac{y}{\sigma_2}\right)$, the summation being extended over all pairs of $x$, $y$, where $\sigma_1$, $\sigma_2$ are the errors of mean square of the $x$'s and $y$'s respectively, and hence $\sigma_1 = \frac{c_1}{\sqrt{2}}$, $\sigma_2 = \frac{c_2}{\sqrt{2}}$.

* See p. 283, supra.

But with other values of $r$ the observed pairs might have been obtained with greater or less improbability, and these values are distributed in accordance with a normal curve of error whose probable error is $.67\,\frac{1 - r^2}{\sqrt{n}}$;* that is, when from all the possible correlation surfaces, which might have resulted in the observed values, those whose correlation coefficients are within the limits $r \pm .67\,\frac{1 - r^2}{\sqrt{n}}$ are selected, the sum of their probabilities is ½.

It will be useful to examine the limits of the possible values 
of r. 

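r always lies between +1 and -1.

For
$$n^2\sigma_1^2\sigma_2^2 - (\Sigma xy)^2 = \Sigma x^2 \cdot \Sigma y^2 - (\Sigma xy)^2, \quad \text{since } \sigma_1 = \sqrt{\frac{\Sigma x^2}{n}},\ \sigma_2 = \sqrt{\frac{\Sigma y^2}{n}},$$
$$= (\lambda_1^2 y_1^2 + \lambda_2^2 y_2^2 + \ldots)(y_1^2 + y_2^2 + \ldots) - (\lambda_1 y_1^2 + \lambda_2 y_2^2 + \ldots)^2,$$
[where $(x_1, y_1)$, $(x_2, y_2)$ ... are pairs of observations, and
$\frac{x_1}{y_1} = \lambda_1$, $\frac{x_2}{y_2} = \lambda_2$, ...]
$$= y_1^2 y_2^2 (\lambda_1 - \lambda_2)^2 + \ldots,$$
which is zero if $\lambda_1 = \lambda_2 = \lambda_3 = \ldots$, but otherwise positive.
Hence $n^2\sigma_1^2\sigma_2^2 - (\Sigma xy)^2$ is positive,
$$1 > r^2,$$
and r is between +1 and -1, except when
$\lambda_1 = \lambda_2 = \lambda_3 = \ldots$; in this case r = ±1, and the
correlation is said to be perfect, positively or negatively.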
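The algebraic step in this proof can be checked numerically. The sketch below assumes the expansion is the familiar one in which each pair of observations contributes a term $(x_i y_j - x_j y_i)^2$, which equals $y_i^2 y_j^2(\lambda_i - \lambda_j)^2$ when no y is zero; the function names and figures are hypothetical.

```python
from itertools import combinations

def product_gap(xs, ys):
    """Sum(x^2) * Sum(y^2) - (Sum(xy))^2, the quantity shown to be non-negative."""
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return sum_xx * sum_yy - sum_xy ** 2

def sum_of_pair_terms(xs, ys):
    """Sum over all pairs i < j of (x_i*y_j - x_j*y_i)^2."""
    return sum((xs[i] * ys[j] - xs[j] * ys[i]) ** 2
               for i, j in combinations(range(len(xs)), 2))

# Unequal ratios x/y: the gap is positive, so r^2 < 1.
gap_general = product_gap([1, 2, 4], [2, 1, 3])
# Equal ratios x/y (all lambdas the same): the gap vanishes, so r = +1 or -1.
gap_proportional = product_gap([2, 4, 6], [1, 2, 3])
```

Working with the cross terms rather than the ratios avoids dividing by a zero deviation, which is a convenience of the sketch and not part of the argument in the text.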

* See Pearson, loc. cit., p. 226; Yule, loc. cit., p. 847; and Pearson,
Proceedings of Royal Society, Oct. 1897. By a similar line of reasoning the
probable error of c as determined on p. 283 is found to be
$.477\,\frac{c}{\sqrt{n}}$.

It may be noticed that on à priori grounds without any
mathematical investigation the formula
$\frac{1}{n}\left(\frac{x_1}{\sigma_1}\cdot\frac{y_1}{\sigma_2} + \frac{x_2}{\sigma_1}\cdot\frac{y_2}{\sigma_2} + \ldots\right)$
gives a good measure of correlation. For if there is positive
correlation, whenever we have a positive value of x we may
expect a positive value of y, and whenever we have a negative
value of x we may expect a negative value of y, and each such
term increases the coefficient; while, if there is no correlation,
for any value of x occurring several times, we may expect
positive and negative values of y which on the whole give a
very small sum. Meanwhile the denominators at once bring
the deviations into relation with the mean deviations, and
prevent the whole coefficient becoming greater than unity.

[Marginal note: The measurement of correlation.]

We see then that r measures the correspondence between
deviations from their means of the two series of observations.
If the deviations are in exactly the same ratio for all pairs,
the correlation is perfect, and r = 1; while r tends to zero when
for a given deviation in one of the series we have excess and
defect with equal frequency in the other.

r serves as a measure of any statement involving two quali- 
fying adjectives, which can be measured numerically, such as 
"tall men have tall sons," "wet springs bring dry summers," 
" short hours go with high wages." 

When r is not greater than its probable error
$.67\,\frac{1 - r^2}{\sqrt{n}}$ we have no evidence that there is any
correlation, for the observed phenomena might easily arise from
totally unconnected causes; but when r is greater than, say, 6 times
its probable error, we may be practically certain that the
phenomena are not independent of each other, for the chance that
the observed results would be obtained from unconnected causes is
practically zero.

The calculation of r is quite simple, and if we can assume 
normal dispersion, so that the probable error in a series is 
equal to .67 of the error of mean square,* can be performed
very rapidly. In the following tables the correlation between 
the prices of wheat, foreign trade, and the marriage-rate, already 
discussed by the help of the graphic method, is investigated. 



* Hence $\sigma$ = about ¾ of distance between quartiles.

Examples of Correlation.

[Tables of yearly figures for the marriage-rate, the price of wheat,
and imports and exports: the columns are not recoverable here; only
the summary results survive and are given below.]

Earlier period (heading illegible; 20 years).

Correlation between marriage-rate and the price of wheat:
$\sigma_1 = .580$, $\sigma_2 = 133$, $\Sigma xy = -445$
$r = \frac{-445}{20 \times 133 \times .58} = -.29$
Probable error of r: value illegible.

Correlation between marriage-rate and imports and exports:
$\sigma_1 = .580$, $\sigma_3 = 90$, $\Sigma xy = 8$
$r = +.007$
Probable error of r = .15

1875-94.

Correlation between marriage-rate and the price of wheat:
$\Sigma xy = 627$
[Or distance between quartiles = .9, whence $\sigma_1 = .67$]
$r = \frac{627}{20 \times 102 \times .651} = +.47$
Probable error = .1

Correlation between marriage-rate and imports and exports:
$\sigma_1 = .651$, $\sigma_3 = 41$
$r = +.25$
Probable error of r = .14

Hence there was slight negative correlation between the
marriage-rate and price of wheat before 1864, that is, the
marriage-rate fell when wheat rose; but since 1864 there is
better evidence that the marriage-rate rises when wheat rises.
The marriage-rate and foreign trade were quite uncorrelated
before 1864, and show only slight correlation at more recent
dates; the odds against the correspondence between the
observed figures, since 1875, arising without causal connection are
only about 4 to 1, if we assume that the figures for each year
are independent of the next.
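The arithmetic of these examples can be retraced with a short sketch. It assumes only the sums and errors of mean square quoted in the tables above, with n = 20 years in each period; the function names are hypothetical.

```python
import math

def r_from_sums(sum_xy, n, sigma_a, sigma_b):
    """r = sum(xy) / (n * sigma_a * sigma_b)."""
    return sum_xy / (n * sigma_a * sigma_b)

def probable_error(r, n):
    """Probable error of r: .67 * (1 - r^2) / sqrt(n)."""
    return 0.67 * (1 - r * r) / math.sqrt(n)

# Earlier period: marriage-rate with price of wheat, and with foreign trade.
r_wheat_early = r_from_sums(-445, 20, 0.58, 133)   # the text gives -.29
r_trade_early = r_from_sums(8, 20, 0.58, 90)       # the text gives +.007
# 1875-94: marriage-rate with price of wheat.
r_wheat_late = r_from_sums(627, 20, 0.651, 102)    # the text gives +.47

# Ratio of r to its probable error for marriage-rate and foreign trade, 1875-94,
# taking the r = +.25 stated in the text.
ratio_trade_late = 0.25 / probable_error(0.25, 20)
```

The last ratio is under 2, which agrees with the remark that the later correlation with foreign trade is only slight.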

[Marginal note: The Galtonian method.]

An earlier method of estimating correlation, introduced by
Mr Galton,* is very useful for a rapid survey of two groups of
figures. As a simple example adequately illustrating the method,
we will take two series where the correlation is likely to be
great, namely, the daily records of maxima and minima temperature
recorded for 1898.*

* See Proceedings of the Royal Society, 1886, vol. xl., Family Likeness
in Stature.

[Table: CORRELATION OF DAILY MAXIMA AND MINIMA OF TEMPERATURE IN 1898.]

We first make a rapid survey of the series, and notice that
the minima range from about 23° to 63°, and the maxima
from about 35° to 95°. Divide each of these ranges into, say,
10 equal parts, and draw up the framework of a table like
the annexed. Turn through the records and enter the maximum
and minimum for each day by a dot in the appropriate
place; thus on 4th October the maximum was 61.3° and the
minimum 51.3°; a dot should be put in the row 60°-64.9° under
the heading 51°-54.9°. When all the dots are entered, replace
them by their number in each square. The table shows the
result for 358 days. If there is correlation, it will be found
that the medians, or arithmetic averages, of each row form an
orderly progression, and similarly for each column. These
medians are roughly estimated and given in the table.
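The entering of dots and their replacement by counts may be sketched as follows. The class widths (5° for maxima starting at 35°, 4° for minima starting at 23°) are assumptions inferred from the two headings quoted for 4th October, and every reading except that day's is hypothetical.

```python
from collections import Counter

def class_lower_bound(value, start, width):
    """Lower limit of the class interval of the given width that contains value."""
    steps = int((value - start) // width)
    return start + steps * width

def two_way_table(pairs, max_start, max_width, min_start, min_width):
    """Count the days falling in each (row of maxima, column of minima) square."""
    table = Counter()
    for maximum, minimum in pairs:
        row = class_lower_bound(maximum, max_start, max_width)
        column = class_lower_bound(minimum, min_start, min_width)
        table[(row, column)] += 1
    return table

# (maximum, minimum) for each day; the first pair is 4th October from the text.
readings = [(61.3, 51.3), (62.0, 52.5), (48.0, 40.1)]
table = two_way_table(readings, max_start=35, max_width=5, min_start=23, min_width=4)
```

Here the key (60, 51) stands for the square in the row 60°-64.9° under the heading 51°-54.9°; the medians of each row and column would then be estimated from the counts.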

To test the correlation of the minima relative to the maxima
a diagram is drawn. Choose scales so that the distance between
the quartiles of the maxima (18°) shall be represented by the
same length vertically, as represents the distance between the
quartiles of the minima (14°) horizontally. Place crosses
horizontally level with the middle points of the successive limits
of maxima and vertically above the positions on the scale of the
medians of the corresponding minima.

* Whitaker's Almanack, 1899.

Now draw two lines. The first through the positions of the
quartiles and median ($Q_1$, $Q_2$, M); this is the line of perfect
correlation, and with the scales we have chosen is at 45° to the
horizontal; draw another line through M, passing as near as possible
to all the crosses. Draw any horizontal line PCN intersecting
the former lines as in the figure. The ratio of CN to PN is the
coefficient of correlation. If the line CM passes through all the
crosses and coincides with PM, the correlation is perfect. If
CM is perpendicular to PM, there is perfect negative correlation.
If CM is vertical there is no correlation. In the figure the ratio
CN to PN is ¾. A rough test of the presence of correlation is
to be obtained by noticing whether all the crosses above the
median are on one side of PM and below the median on the
other side.

[Marginal note, partly illegible: Relation between ...]

There is a simple connection between the coefficient thus
determined and that obtained by the previous formula. On
the diagram the scales are so chosen that we replace
$\frac{x}{\sigma_1}$, $\frac{y}{\sigma_2}$ by quantities $\xi$, $\eta$ measured
by equal units. Then if $(\xi_1, \eta_1)$, $(\xi_2, \eta_2)$ ... are the
positions of the 358 original pairs the line y = rx can be shown to
be that whose mean distance from these points is a minimum when
$r = \frac{\Sigma\xi\eta}{n}$, its value previously given. It is easily
seen that in the figure the ratio $\frac{CN}{PN}$ is $r_1$ if
$y = r_1 x$ is the equation of a line through M referred to
horizontal and vertical axes. Hence the line CM might be drawn
from the original formula by taking
$\frac{CN}{PN} = \frac{\Sigma xy}{n\sigma_1\sigma_2}$; in other words, we
have here a graphic method of finding the coefficient of correlation.
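The connection between the two methods can be illustrated numerically. The sketch below assumes that "mean distance a minimum" is taken in the least-squares sense, with distances measured parallel to one axis, and that both series are reduced to deviations divided by their errors of mean square; the function names and the readings are hypothetical.

```python
import math

def standardise(values):
    """Deviations from the mean, divided by the error of mean square."""
    n = len(values)
    mean = sum(values) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sigma for v in values]

def slope_through_origin(xi, eta):
    """Least-squares slope of eta on xi for a line through the origin (the point M)."""
    return sum(a * b for a, b in zip(xi, eta)) / sum(a * a for a in xi)

def r_from_formula(xi, eta):
    """Sum of the products of the standardised deviations, divided by n."""
    return sum(a * b for a, b in zip(xi, eta)) / len(xi)

# Hypothetical daily maxima and minima.
maxima = [40.0, 55.0, 61.3, 70.0, 82.0]
minima = [30.0, 41.0, 51.3, 52.0, 60.0]
xi, eta = standardise(maxima), standardise(minima)
slope = slope_through_origin(xi, eta)
r = r_from_formula(xi, eta)
```

Because the standardised deviations have a mean square of unity, the fitted slope and the value of r from the formula coincide, which is the sense in which the diagram gives r graphically.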

Calculating r roughly from the data $\sigma_1 = 12.7$, $\sigma_2 = 9.1$,
$n = 358$, $\Sigma xy = 32130$,
$r = \frac{32130}{358 \times 12.7 \times 9.1} = ¾$ approx.; that is, we obtain
approximately the same value by either method.

Mr Galton applied this method to the question of inheritance
of stature. He found that the correlation between the statures of
children and of their parents was ⅔. That is, if a group
of parents had an average stature x inches above (or
below) the general average, the average for their sons was only ⅔x
inches above (or below) the general average. This return towards
the average is called in biological language "regression," and hence
the coefficient of correlation is often spoken of as the "coefficient
of regression," and such an equation as
$\frac{y}{\sigma_2} = r\,\frac{x}{\sigma_1}$ is called the
"equation of regression." In words this equation is: the ratio
of the divergence of one quantity from its mean value to its
standard deviation equals the ratio of the divergence of a
correlated quantity to its standard deviation, multiplied by the
coefficient of regression.

There is an intimate relation between the law of error and 
biological theory. The law of error and other cognate laws 

give algebraic expression to the universal ten- 
dency to variation, whether we are dealing with 
any part of the social organism to whose measurement we have 
in this book limited statistics, or with any measurable organ of 
an animal or vegetable. The law of heredity can be only tested 
numerically by the theory of correlation ; the effect of natural 
selection is easily considered with the help of the coefficient of 
regression. For if there is no selection, the distance from the 
general average of the mean stature of successive generations, 
descended from a group whose mean deviation was x, will be
$rx, r^2x, \ldots, r^nx$ if r remains unchanged, a series whose terms
rapidly tend to zero. If on the other hand a selection is made 
in each generation of those above the average, the divergence 
can be preserved and intensified. The discussion of this point 
would lead us too far afield. 
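The series of deviations in successive generations, when there is no selection, can be written out by a short sketch. The value r = 0.5 and the initial deviation of 3 inches are hypothetical, chosen only to make the arithmetic plain.

```python
def regression_series(initial_deviation, r, generations):
    """Mean deviations r*x, r^2*x, ..., r^n*x of successive generations."""
    return [initial_deviation * r ** k for k in range(1, generations + 1)]

# A group 3 inches above the general average, with an assumed r of one half.
series = regression_series(3.0, 0.5, 4)
```

Each term is the previous one multiplied by r, so with any r between 0 and 1 the deviations shrink steadily towards the general average.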

[Marginal note: Conclusion.]

In this Second Part we have only discussed the elements of
the subject, the theorems and formulae which writers on statistics
now assume. We have examined only the normal curve of error, and
have not touched the asymmetrical curve of error, or algebraic
formulae arising from different hypotheses, or correlation between
more than two
variables. In the region to which we have confined ourselves, 
however, we have had to deal with arguments of the same 
nature as are to be met with in the higher paths of statistics. 
The great difficulty which the student of economics encounters 
when dealing with the theory of error is the apparent slightness 
of relation between this theory and the facts with which he 
deals. This slightness is only apparent; it is because the
theory has not, in the form he meets it, been carried far enough
to fit it to the very complex facts of human affairs that we do
not get that exact correspondence we might desire. The 
theoretical distribution of error may be expected to underlie 
all phenomena, just as the attraction of gravity underlies the 
action of all machinery. We cannot explain the motion of 
machinery by gravity alone, we need to consider also other 
natural forces, not so easily measured as gravity ; but still less 
can we explain that action if we ignore the force of gravity. 
It is hoped that the short treatment here given of the 
elements of so important a subject may make smoother the 
approach to a field of investigation where there is great promise 
of harvest but where the reapers are as yet few. 

Note. While this book has been in the Press, an article by Prof. Pearson
has appeared in the Philosophical Magazine, July 1900, violently criticising
the method adopted by most of his predecessors, who have investigated the
applicability of the Law of Error to Statistics, that is to say, the method of
first deducing the equation from à priori considerations, and then comparing
the results with experiments. By means of a criterion of "fitting," which
should be carefully studied, he shows that the chances that the statistics,
with which Airy and Merriman illustrate the theory, would have arisen
from random sampling are only .01423 and .000,00155 respectively on their
hypothesis, and deduces "that the normal curve possesses no special fitness
for describing errors or deviations such as arise either in observing practice
or in nature." It is to be remarked on this, first that the investigation of
two examples does not prove his case, secondly that his criticism does not
apply to such curves as the asymmetrical curve of error treated exhaustively
by Professor Edgeworth, and thirdly that the claim of the authors, whom
he treats with such contempt, is not that the fit is exact, but "that the
formula represents with all practicable accuracy the observed frequency"
(Airy, quoted by Pearson) or "that the agreement is very satisfactory"
(Merriman): thus the authors in question make no claim that the normal
law is the complete explanation of the observed errors, but are satisfied with
the approximation they found: it was not to be expected that the pioneers in
the field should attain finality. By a similar process the law of gravitation
might be treated with derision by criticising the experiments of an Attwood's
machine, when the resistance of the air was not considered. Prof. Pearson
has four constants in the curve by which he attains a close fit in his
Illustration IV., and by increasing the number of his constants might obtain
an absolute fit. With those developments of the normal curve of error, which
depend on hypotheses very similar to those used by the earlier writers (see
p. 303, supra, and Professor Edgeworth's recent contributions to the
Statistical Journal), more constants are present, and there is every
likelihood that equally close agreement may be found. The present author does
not, however, wish to enter here into the controversy as to which is the best
formula for classifying phenomena. His intention has been to follow in the
beaten track, and there can be little doubt that the ordinary reader will
prefer to find some à priori justification for the unfamiliar theory that
natural phenomena can be represented by the formulae of algebraic probability,
pace the author of The Grammar of Science, though he may recognise that the
ultimate justification for the theory must be experience. There is no
suggestion in this book that the whole of nature can be measured by the
foot-rule of the normal curve of error; but yet that it may be a useful
instrument has been shown by few people more conclusively than by Prof.
Pearson himself.

In the following list will be found those books and articles relating to the subject
of Part II. of this book which are most accessible and likely to be most useful to the
English student. Further references to foreign authors and to earlier writers will of
course be found in the works here mentioned:

Todhunter, I.: History of the Theory of Probability. Especially Arts. 993-1002.

Encyclopædia Britannica: Article on Probability.

Dictionary of Political Economy (Palgrave's): Article on Law of Error.

Galton, F.: Inquiries into Human Faculty and its Development.
    Natural Inheritance.
    Family Likeness in Stature, Proc. of Royal Soc., 1886, 1888.

Merriman, M.: Method of Least Squares.

Chauvenet: Practical and Spherical Astronomy, vol. ii., App.

Edgeworth, Prof. F. Y.: In the London, Edinburgh and Dublin Philosophical
    Magazine and Journal of Science (formerly issued under other similar titles,
    and known as the Literary and Philosophical Magazine), 5th series.
        Vols. 21, 22, 23, 24, 25, 30. Various investigations and examples.
        Vols. 34, 35, 36. Correlation.
        Vol. 41. Asymmetrical law of error.
    In the Journal of the Royal Statistical Society:
        1886 and Jubilee Volume. Methods of Statistics.
        1888 and 1890. Chance in competitive examinations.
        1893 and 1894. Correlation.
        1895. Recent contributions (Pearson's) to theory.
        1896, 1897, and 1898. Miscellaneous applications of the Calculus of Probabilities.
        1899 and 1900. Representation of Statistics by Mathematical Formulae.
    Report of Committee of British Association on Monetary Standard, 1888.
    Camb. Phil. Soc. Trans., 1885 and 1886. Merits of various means.
    [The exact titles of the above articles may be found from the indexes of the
    volumes mentioned.]

Pearson, Prof. K.: The Chance of Death and other Essays.
    The Grammar of Science, chaps. x., xi.
    Contributions to Mathematical Theory of Evolution in Transactions of Royal
    Society, 1894, 1895, 1896, and Stat. Soc. Journal, 1896, 1897.
    Probable errors of frequency constants, Royal Soc. Trans., 1898.
    Criterion ... of Deviations ... in a Correlated System ..., Phil. Mag.,
    July 1900.

Venn, Dr J.: The Logic of Chance.
    Nature and Use of Averages, Stat. Journal, 1891.
    Cambridge Anthropometry, Journal of Anthropological Institute, Nov. 1888.

Yule, U.: History of Pauperism, Stat. Journal, 1896.
    Theory of Correlation, Do. 1897.
    Changes in Pauperism, Do. 1899.
    Association of Attributes in Statistics, Royal Soc. Trans., 1900.

Bowley, A. L.: Accuracy of an Average, Stat. Journal, 1897.

Sheppard, W. F.: On the calculation of the Average Square, Stat. Journal, 1897.
    Use of Auxiliary Curves, Stat. Journal, 1900.
    Normal Correlation, Camb. Phil. Society, vol. xix.
    Normal Distribution and Correlation, Royal Soc. Trans., 1898.
The thick line $A_0 A_1 A_2 A_3 A_4$ represents the normal
curve of error.

$C_1 C_2 C_3$ is a curve of error with the same unit of
abscissae as $A_1 A_2 A_3$, but with ordinates diminished in the
ratio 4 to 1.

$B_1 B_2 B_3$ is a curve of error with both ordinates and
abscissae half those of $A_1 A_2 A_3$.

The areas of $B_1 B_2 B_3$ and $C_1 C_2 C_3$ are equal; but the
modulus of the former is half that of the latter, and it
represents observations of twice the precision.

The area contained between the vertical lines through
$P_1$, $P_2$ and the curve $A_1 A_2 A_3$ and $X\,O\,X_1$ is half the area
between the curve and $X\,O\,X_1$; similarly for $C_1 C_2 C_3$;
similarly with lines through $p_1$, $p_2$ for $B_1 B_2 B_3$.

$P_1$, $P_2$, $p_1$, $p_2$ are positions of probable errors.

$M_1$, $M_2$, $m_1$, $m_2$ are positions of moduli.

$E_1$, $E_2$, $e_1$, $e_2$ are positions of the average errors.

$S_1$, $S_2$ are positions of errors of least square, and $S_1$, $S_2$, $S_3$, $S_4$
are points of inflexion.

$A_2 F_1 F_2$ represents half the binomial $(\frac{1}{2} + \frac{1}{2})$ raised to a power [exponent illegible].

$A_2 G_1 G_2 G_3 G_4 G_5$ represents half a binomial [expression illegible].
Accuracy, 199-214.

Age, 29, 147, 251, 312. 

Agricultural Wages, 50-52, 97-103, 109,
110, 115-118.

Arithmetic Average or Mean, 107- 1 10, log^ 
125, 126, 128, 129, 130, 136, 221 ; 
error in, 204, 306. 

Average error, 28^^ 2^2, 

Average wage, 6, 11.

Average : Precision of, 305-308. 

Averages, 7, 19, 89, 92, 95, 107-130, /^o, 
i33-'40. I43» 214, 264; see Arith- 
metic Average^ Median^ Mode^ Weighted 
Average, 

Bertillon, Dr J., 17, 129, 130, 158. 
Bias, 118. 

Biassed errors, 209-214, 219. 
Bibliography : of Interpolation, 258 ; of 

Law of Error, 327.
Binomial Expansion or Theorem, 265, 

272-277, 288, 291, 301. 2, 306. 
Births, 287. 
Blank Forms, 18, 19, 26, 27, 30, 37, 42, 

46, 63 ; specimens of, 23, 35, 36, 45, 

48, 51. 52, 54-58, 65, 67, 69. 
Boole, 242, 247, 251. 
Booth, C., 9, 27, 32, 78-80, 123, 158,

251. 
Bortkewitsch, Dr, 302. 

Cartograms, 156-158. 

Census: Population, 10, 11, 23-32, 63,
78-81, 82, 99, 233. 

Census: Wage, 11, 12, 33-40, 63, 87, 
92-96, 114, 125, 233. 

Chance, 266, 267 ; see Probability. 

Changes in Wages, 54-58, 61, 97-103. 

Chauvenet, 303. 

Coefficient: of Correlation, 318; of Probability, 299;
of Regression, [page illegible].

Coefficients: Statistical, /^, 130, 296, 
299. 

Collection of material, 17, 18. 

** Combinational " Groups, 299. 

Comparison : Accuracy of, 2g6, 212, 305. 

Comparisons of Series, 168-177, 192-194. 



Consumption : Index No. of, 228. 

Correlation, j/6, 317-326. 

Cotton: wages, 39, 95, 96, 114; trade,

164-167. 
Curve of Error : see Error^ Law of. 
Curves of Frequency, joj. 
Cycles of Trade, 153, 181. 

Darwin, G. H., 256. 

De Morgan, 242, 247.

Deciles, 124^ 1.25-128, 133, 136, 144. 

Demography, 6, 7, 23. 

Diagrams, 19, 88, 143-196. 

Dispersion, /j^, 140. 

Earnings, 37. 

Economist, The, 11, 214, 221, 223-224.
Edgeworth, Prof. F. Y., 118, 187, 253, 
254. 257, 262, 285, 299, 303, 307, 313, 

315- 
Employment, see Unemployment, 

Equation of Regression, jiPj. 

Error, 20 1^ 203-214. 

Error, Curve of, 261, 269-292, 309-310,

311 ; Law of, 5, 267, 303-315. 
Error of mean square, 28^^ 286, 290, 

2g2; see Standard Deviation, 
Exports, see Foreign Trade, 

Facility Curves, joj. 
Fluctuation, 28Sy 291^ 298. 
Foreign Trade, 11, 63-70, 148, 151-4,
170-1, 174-7, 188-191, 221-3, 250, 320-2.
Forms of Inquiry, see Blank Forms, 
Fox, W., 100, 124. 
Foxwell, Prof. H. S., 181. 
French Wages, 39, 40. 
Frequency Curves, 303, 

Gabaglio, Prof. A., 156.
Galton, F., 89, 126-8, 322, 324. 
Geometric Mean, 128^ 129, 221-3. 
Giffen, Sir R., 10, 70, 151-4.
Graphic Method, see Diagrams; of Interpolation, 238-9.
Great Numbers, 8, 263-4. 
Historical Diagrams, 159-167.
Hours of work, 54, 57, 58.

Imports, see Foreign Trade. 
Index-Numbers, 111-2, 190, 217-29.
Interpolation, 19, 233-58. 

Jevons, W. S., 128, 178. 

Labour Commission, 37. 

Labour Department, 10, 41-62, 63, 97- 

103. 
Labour Gazette^ The, 12, 41, 44, 46, 48, 

50, 58, 60, 239. 
Labour Statistics, 10. 
Large Numbers, 4. 
Least Squares, 5, 177. 
LePlay, P. G., 7. 
Levasseur, P. E., 156. 
Levi, Leone, 9. 

Lexis, Prof. W. , 263, 280, 297, 298, 299. 
Logarithmic Curves, 188-196. 
Logarithms, Table of, 195-6.
Luck, 293. 

Makeham's Formula, 257. 

Marriage Rate, 108, 174-7, 193-4, 320-2.

Maximum Ordinate, 119; see Mode.

Mean of Errors ; see Average Error, 

Median, 95, 117, 123, 124, 125-8, 130, 
133. 136. 138. 144, 154-6, 221, 224, 
290, 323 ; determination of, 127, 155, 
252. 

Merriman, M., 256, 284, 303. 

Method of Least Squares, 5, 177 ; Statis- 
tical, 4, 7, 17-20. 

Mode, 7/9, 118-124, 130, 133, 136, 144, 
155-6 ; determination of, 155, 252. 

Modulus, 283-5^ 288, 289, 290, 2gi^ 299, 
303-9. 312, 313, 314; probable errors 
of, 319. 

Occupation, 29, 82, 99, 101.
Official Statistics, 9, 10, 213. 

Pearson, Prof. Karl, 5, 316, 318, 319, 

326. 
Periodic Figures, 178-87, 240, 315.
Population, 28 ; see Census. 
Poynting, Prof. J. H., 181. 
Precision, 201, 28^^ 286, 2g2y 309 ; scale 

of, 273 ; of an average, 305-8 ; see 

Modulus. 
Prices, see Index-Numbers. 



Probable Error, 281, 29O, 2^2, 305, 306, 
307, 308, 320; of coefficient of regres- 
sion, 319; of modulus, 319. 

Probability, 20, 26^. 

Purchasing Power, see Index-Numbers, 

QuARTiLES, 95, 124, 125-8, 133, I34f 

136, 144, 155. 290, 313, 320. 
Questions, 24. 
Quetelet, A., 118-9, 124, 272-3, 278, 

280, 284, 285. 

Regression, 324, 325. 

Retail Prices, 1 1 ; Index - Number of, 

225-8. 
Revenue, 159-61. 

Samples, 20, 219, 224, 225, 308-313. 

Sauerbeck, A., 190, 223-4. 

Sheppard, W. F., 255.

Smoothing, 151-6. 

Standard Deviation, 283 y 2Q2. 

Statistics, j, ^, 7, 262 ; official, 9, 10. 

Statistical Abstract y They I a 

Statistical Coefficients, /^, 130, 296, 

299. 
Strikes, 51, 54-62. 
Summary, 17, 19. 
"Symptomatic" Series, 240, 315. 

Tabulation, 17, 18, 24, 73-103, 133- 

140. 
Tellers, 3, 25, 26, 28, 31. 
Trade Unions, 42, 46, 53, 81. 
Type, 6, 124, 130. 

Unbiassed Errors, 209-14, 219. 
Unemployed, 40, 41, 42, 45, 46, 52, 178- 
187, 192-4. 

Venn, Dr J., 262, 266. 

Wage Census, see Census. 

Wages Statistics, 11,87-92, 120, 134-6, 

144-6, 250, 310. 
Wages, 54, 57, 58, 61, 149. 150; see 

Agriculture. 
Weighted Average, ///, II2-8 ; errors in, 

205, 207, 214, 219-222, 304, 
Westergaard, Prof. H., 287. 
Wheat, 161 -3, 174-7, 186, 320-2. 
Wholesale Prices, 11; see Index-Numbers.
Wood, G. H., 228. 

Yule, U., 253, 317, 319.
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RECENT PUBLICATIONS. 

History of the English Poor Law, 

Vols. I. and IL, in connection with the Legislation and other 
Circumstances affecting the Condition of the People, a.d. 924 
to 1853. By Sir George Nicholls, K.C.B., Poor Law Com- 
missioner and Secretary to the Poor Law Board. New Revised 
Edition, with a Biography and Portrait of the Author. 2 Vols. 
Demy 8vo, cloth, 30s. 

The demand for this standard work, which has been out of print for some time, 
since its ordinal publication in 1854, has been such as to call for a new edition. 
This new edition is edited by Mr H. G. Willink, a grandson of the author, and 
Chairman of the Bradfield Rural District Council. He incorporates in it the manu- 
script notes and corrections made by Sir George Nicholls in his own copy, and has 
written an interesting biography of the author, which .appears as a preface to Vol. I. 
A new index has been made and placed at the end of Vol. II. 

Spectator. — *' This new edition of a work, which has almost become a classic, is 
enriched by a life of the author, and by many notes. . . . One feels, on reading 
this long record of unwise legislation, how true is the well-known saying, ' With 
how little wisdom the world is governed.*" 

AnnaJs of the American Academy of PoHUcclI and Social Science. — ** This new 
edition of Nicholls's ' History of the English Poor Law" will be thoroughly appreciated 
by a large circle of readers, including students of several of the social sciences." 

Manchesietf Guardian, — ** Nicholls's work is valued for the light that it throws 
upon the abuses of the old Poor Law." 

Local Government Journal, — " The work reads as freshly as if only just written, 
and the life of the author by Mr H. G. Willink adds to the value of the work, a 

g*rusal of which is a complete education on the history of the poor relief in 
ngland." 

History of the English Poor Law, 

Vol. III., 1834 to 1898, being an Independent as well as a 

Supplementary Volume to Vols. I. and 1 1. By Thomas 

Mackay, Author of "The English Poor," and Editor of the 

Volume of Essays, "A Plea for Liberty: An Argument against 

Socialism." Demy 8vo, cloth, 21s. 

The scope and character of the work, which brings the subject from 1834 down 
to the present time, and the thoroughness with which the author has treated the 
subject, compares ^vourably with that of Volumes I. and II. This volume has 
assumed more or less the form of an independent work, and the reader is asked to 
regard it as a supplement rather than as a continuation of Sir George NichoUs's 
history. A separate and complete index has been provided. 

Pall Mall Gazette. — " As befits the sequel to a classic, this work at once takes 
its place in the first front of the literature on the subject." 

Quarterly Reinew, — '* Mr Mackay has produced a remarkable book, written in 
a popular style, which will appeal to a wider circle of readers than either official 
publications or purely scientific works can hope for. We have no hesitation in 
saying that it is one which nobody interested in the Poor Law can afibrd to pass 
by ; and that it will amply repay careful study on the part of those who are familiar, 
not only with Blue-book literature, but with the purely scientific treatises written 
by English and German authors on the English Poor Law." 

Speaker. — ** Every reader who has tramped with Nicholls along the highways 
and byways of Poor Law history will be grateful for the easier and more attractive 
route provided in the philosophic treatise of Mr Mackay." 

Spectator, — " The work of a man most fully qualified both by grasp of economic 
principles and by practical experience." 

Manchester Guardian. — *'Mr Mackay has produced a very valuable book, 
which no future student of the modern Poor Law can afford to neglect." 

P. S. KING & SON, ORCHARD HOUSE, WESTMINSTER. 
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RECENT PVBLlCATlONS— continued. 

Our Treatment of the Pooro 

By VV. Chance, M.A., Author of "The Better Administration 
of the Poor Law," "Children under the Poor Law,"&c. Crown 
8vo, cloth, 240 pages, 2s. 6d. 

CoNTBNTS :-^Iniroduction— A Model Union and its Lessons — Old Age Pen- 
sions — The English Poor Law and Friendly Societies — Public and Private Charity 
— In Defence of Poor Law Schools — Appendices — Index. 

Pa// Mai/ Gazette, — ** At a time when the Poor Law is once more the subject of 
impassioned denunciation in respect to its treatment of both young and old, it 
would, we think, be well if the public could be got to study with due seriousness 
the facts contained in this volume. They are to be lairned nowhere else more 
concisely or convincingly. Mr Chance is no mere doctrinaire. His book is full of 
valuable evidence as well as of close reasoning. . . . This volume should 
certainly be read by every Guardian of the poor. 

Pcx>r Law Conferences. 

Messrs P. S. King & Son publish the Proceedings of the 
Central and District Poor Law Conferences, a description of 
which, with list of the Papers read at the Conferences held* in 
the year 1899- 1900, will be forwarded on application. 

Annual Subscription. 

Report of any single Conference - . - . is. od. 

Each Conference, sent as soon as published - - los. 6d. 

Annual Bound Volume, with Index - - - - 12s. od. 

The Publishers are confident that, dealing as these Conferences do with every 

subject that touches on the administration of the Poor Law, the Annual Bound 

Volumes will be found of great use to Guardians in the responsible work in which 

they are engaged, and to all others interested in the poor and their relief. 

1899-1900. Papers read at the Central and District Poor Law 

Conferences, held from May 1899 to March 1900, with the 

Discussion thereon. Report of Central Committee and Index. 

Portrait. 8vo, cloth, 660 pages, r2S. 

Subjects : — Accounts — After Care of Children — Aged Poor — Casual Ward — 

Children of Tramps — Cottage Homes — Detention of Paupers — Indoor Cases — 

Labour Homes — Nursing — Old Age Pensions — Outdoor Relief — Pauper Children 

— Pauperism and Overcrowding — ^Phthisis — Workhouses. 

How the English Workman Lives. 

Being the Experiences and Reflections of a German Coal-Miner 
(Ernst Duckershoff) in England. Translated by C. H. d'E. 
Leppington. Crown 8vo, is. 

Standard. — *' There is a piquant interest in this little volume. We only trust 
that the author's old friends at home will not find his little book so interesting that 
it will lead to a notable increase in the number of immigrant pitmen ' made in 
Germany.*" 

Manc/iester Guardian, — *' In this little lxx)k the author compares his impres- 
sions of the two countries. The result is instructive and amusing. " 

Houses for the Working Classes. 

How to provide them in Town and Country. Papers read at the 
National Conference on Housing, held in London, Mar. 1900. is. 
Contents :— Bad Housing in Rural Districts, by Clement Edwards — 
Labourers' Cottages, by Miss Constance Cochrane — Facts as to Urban Over- 
crowding, by Dr Edward Bowmakrr — The Existing Situation in London : 
Statistics of the Problem, by Mrs R. C. Phillimore — Powers of Local Authorities, 
by Alderman W. Thompson — Consideration of the Practical Difficulties as regards 
Building, by Councillor H. C. Lander — General Principles, by Councillor F. 
Lawson Dodd — A Select Bibliography, by Sidney Webb, L.C.C. 

P. S. KING & SON, ORCHARD HOUSE, WESTMINSTER. 



RECENT PUBLICATIONS—continued. 

Taxation, Local and Imperial ; and Local Government. 

By J. C. Graham, Barrister-at-Law. Third Edition. Revised 
and brought up to date by M. D. Warmington, Barrister-at-Law. 
Cloth, crown 8vo, 2s. 

This new edition, which discusses the question of taxation as applied to both its 
imperial and local aspects, will prove most useful to those persons engaged in the 
administration of public affairs, as well as instructive to any one who is interested 
in the subject of taxation and local government. 

City Press, — "The distinctions between imperial and local taxation are clearly 
indicated, and the exact duties that devolve upon local authorities are defined with 
commendable exactitude." 

Local Government Journal. — "Exceedingly valuable for reference." 

Sheffield Daily Telegraph. — "The facts are clearly stated and well expounded." 

Local Government Chronicle. — "Clear and to the point." 

The School and Society. 

Lectures by John Dewey, Professor of Pedagogy in the Univer- 
sity of Chicago, supplemented by a Statement of the University 
Elementary School. 130 pages. 8vo, cloth. Facsimile Illus- 
trations of Children's Drawings. Third Edition. 4s. 

The School and Social Progress— The School and the Life of the Child— Waste 
in Education — Three Years of the University Elementary School. 

Criminal Appeal. 

The Necessity for Criminal Appeal, as illustrated by the Maybrick 
Case, and the Jurisprudence of various Countries. Edited by 
J. H. Levy. 610 pages. Demy 8vo, cloth, 10s. 6d. net. 

Contains a Revised Report of the trial of Mrs Maybrick, with Explanatory and 
Critical Notes, Speeches of her Counsel, the late Sir Charles Russell (Lord Russell 
of Killowen, Lord Chief Justice of England), together with Essays on the 
circumstances and events relating to the case, before and after the trial, and New Evidence 
collected since the Verdict. 

The Volume also contains Essays on the Reparation of Judicial Errors 
in various countries : — 

America — By Max J. Kohler, A.M., LL.B., New York. 

England — By C. H. Hopwood, Q.C., Recorder of Liverpool. 

France — By Monsieur Yves Guyot, late Minister of Public Works. 

Germany — By Rechtsanwalt Friedrich Kraft, Giessen. 

Italy — By Signor Antigono Donati, Avvocato, Rome. 

Portugal — By Professor Major Greenfield de Mello, Lisbon. 

Norway — By Advokat Frantz F. Melhuus, Christiania. 

Switzerland — By Monsieur Henri Decugis, Docteur en Droit, Paris. 

Daily Mail. — "Persons who are in favour of the institution of a Court of 
Criminal Appeal will find arguments in plenty in this corpulent compilation." 

Leeds Mercury. — "Since Mr Levy has printed this very fat (and well printed) 
book, it will no doubt take its place as the standard and most complete text of the 
Maybrick case." 

Municipal Finance and Municipal Enterprise. 

By Rt. Hon. Sir H. H. Fowler, G.C.S.I., M.P., being his 
Annual Address, May 1900, as President, to the Royal Statistical 
Society. 1s. 

Analysis of the Finance of Municipal Trading — The Advantages and Limits of 
Municipal Management — Statistics of Profit and Loss in various Municipal 
Undertakings — Loans raised, &c. 

Times. — "A remarkable address." 
RECENT PUBLICATIONS—continued. 

London School of Economics and Political Science. 

Studies in Economics and Political Science. 

A Series of Handbooks by Writers connected with the London 
School of Economics and Political Science. Edited by Prof. W. A. 
S. Hewins, Director. 

The History of Local Rates in England. By Edwin Cannan, 
M.A., Balliol College, Oxford, Lecturer at the School. 2s. 6d. 

Law Journal, — " So interesting and so instructive. . . . Every lawyer and 
political student ought to read them." 

Select Documents Illustrating the History of Trade Unionism. 
I.—The Tailoring Trade. By F. W. Galton. With Preface 
by Sidney Webb, LL.B. 5s. 

Times. — "What Professor Brentano failed to find when he collected the 
materials for his memorable essay * On the History and Development of Guilds 
and the Origin of Trade Unions,' Mr Galton has discovered in great abundance, 
setting forth in his introduction the historical sequence and the economic signifi- 
cance of the documents themselves and the movement they illustrate with no little 
skill and insight." 

German Social Democracy. By Hon. Bertrand Russell, B.A., 
Fellow of Trinity College, Cambridge. With an Appendix on 
Social Democracy and the Woman Question in Germany by 
Alys Russell, B.A. 3s. 6d. 

Times, — "A history of the movement during the last thirty years and of the 
abortive efforts to retard its growth leads up to the consideration of its present 
position, which is approached in a fair-minded spirit and discussed with insight and 
judgment." 

The Referendum in Switzerland. By M. Simon Deploige, 
University of Louvain. With a Letter on the Referendum in 
Belgium by M. J. van den Heuvel, Professor of International 
Law in the University of Louvain. Translated by C. P. 
Trevelyan, M.A., Trinity College, Cambridge, and edited with 
Notes, Introduction, Bibliography, and Appendices, by Lilian 
Tomn, Girton College, Cambridge, Research Student at the 
School. 7s. 6d. 

E. P. Oberholtzer in the Annals of the American Academy. — ". . . We 
will content ourselves with advising those who wish to understand this subject to 
apply themselves to the study of this instructive treatise." 

The Economic Policy of Colbert. By A. J. Sargent, B.A., 
Brazenose College, Oxford ; Hulme Exhibitioner, Oxford ; and 
Whateley Prizeman, Trinity College, Dublin ; Lecturer at the 
School. 2s. 6d. 

Saturday Review. — "Mr Sargent's monograph on Colbert is a very thorough 
bit of work. We have rarely met with a book that concealed with so careless a 
grace the elaborate researches it has entailed." 

The Receipt Roll of the Exchequer for Michaelmas Term of 
the Thirty-first Year of Henry the Second (1185). A 
unique fragment transcribed and edited by the Class in Palaeo- 
graphy and Diplomatic under the supervision of the Lecturer, 
Hubert Hall, F.S.A., H.M. Public Record Office. With 
Thirty-one Facsimile Plates in Collotype, and Parallel Readings 
from the contemporary Pipe Roll. £2, 2s. net. 

P. S. KING & SON, ORCHARD HOUSE, WESTMINSTER. 
QUARTERLY JOURNAL 
OF ECONOMICS 

PUBLISHED FOR HARVARD UNIVERSITY. 

Is established for the advancement of knowledge by the full and free 
discussion of economic questions. The Editors assume no responsi- 
bility for the views of Contributors, beyond a guarantee that they 
have a good claim to the attention of well-informed readers. 

Communications for the Editors should be addressed to The 
Quarterly Journal of Economics, Cambridge, Mass. ; business 
communications and subscriptions ($3.00 a year), to Geo. H. Ellis, 272 
Congress Street, Boston, Mass., U.S.A. 

SOME ARTICLES PUBLISHED IN RECENT NUMBERS. 

W. J. Ashley, Professor in Harvard University. 

The Tory Origin of Free Trade Policy (July 1897). 

The Commercial Legislation of England and the American Colonies, 
1660-1760 (November 1900). 

E. v. Böhm-Bawerk, Professor in Vienna, Austria. 

The Positive Theory of Capital and its Critics (January, April 1895 ; 
January 1896). 

J. B. Clark, Professor in Columbia University. 

The Future of Economic Theory (October 1898). 

Natural Divisions in Economic Theory (January 1899). 

C. F. Dunbar, late Professor in Harvard University. 

The Safety of the Legal Tender Paper (April 1897). 

The National Banking System (October 1897). 

Can we keep a Gold Currency? (April 1899). 

G. Droppers, sometime Professor in Tokyo, Japan. 

Monetary Changes in Japan (January 1898). 

I. Fisher, Professor in Yale University. 

Cournot and Mathematical Economics (January 1898). 

C. Gide, Professor at Montpellier, France. 

Productive Co-operation in France (November 1900). 

W. Lexis, Professor in Göttingen, Germany. 

The Concluding Volume of Marx's "Capital" (October 1895). 

Alfred Marshall, Professor in the University of Cambridge, England. 

The Old Generation of Economists and the New (January 1897). 

S. N. D. North, Secretary of the American Association of Wool Manufacturers. 

Industrial Arbitration : Its Methods and Limitations (July 1896). 

F. W. Taussig, Professor in Harvard University. 

The International Silver Situation (October 1896). 

The Iron Industry in the United States — I. A Survey of Growth ; II. The 
Working of Protection (February, August 1900). 

F. A. Walker, late President of the Massachusetts Institute of Technology. 

The Quantity Theory of Money (July 1895). 

W. F. Willcox, of Cornell University. 

A Difficulty with American Census-taking (August 1900). 

W. F. Willoughby, of the United States Department of Labour. 

The Study of Practical Labour Problems in France (April 1899). 
THE JOURNAL OF 
POLITICAL ECONOMY 

Published by the University of Chicago Press. 

J. LAURENCE LAUGHLIN, Professor and Head of 

the Department of Political Economy in the 

University of Chicago. 

Managing Editor. 

THORSTEIN B. VEBLEN, Assistant Professor in Political 
Economy in the University of Chicago. 

Published quarterly. Subscription price for the United States, $3.00 
a year; foreign countries, $3.40 ; single copies, 75 cents. 



The Journal of Political Economy, while not excluding scholarly 
contributions in the field of theory, aims primarily to treat practical 
economic questions in which the whole community are vitally in- 
terested, such as Money, Banking, Railway Transportation, Shipping, 
Taxation, Socialism, Wages, and Agriculture. Besides leading 
articles, the Journal provides a department of Notes, and of signed 
Reviews, which give an impartial account of recent economic books 
of importance. The contents are intended to be both instructive and 
interesting to the general reader as well as to economic specialists. 



The following comment will be of interest from Professor J. W. 
Jenks, the Department of Political Science, Cornell University : — 

"I have been acquainted with The Journal of Political Economy from the 
beginning. In its leading articles one finds careful discussions of important economic 
questions, which often have a direct bearing upon the questions of the day. The 
articles are frequently the result of painstaking original investigation. . . . The 
book reviews are characterised by independence and entire fearlessness in 
judgment, a quality that one particularly desires in book reviews on scientific questions." 
MUNICIPAL AFFAIRS 

A Quarterly Magazine for all interested 
in city problems. 



What is it? 

"Very excellent." — Review of Reviews (American). 

"Exceptionally interesting and important." — Chicago Record. 

"None more welcome." — Sanitary Record. 

"Indispensable to students of municipal problems." — Outlook. 

"A meaty publication." — Boston Globe. 



Its articles have been widely republished in the United States, 
Canada, Great Britain, and Australia, and translated into German, 
French, Spanish, and Italian. 



Who contribute? 

The best writers on both sides of the Atlantic. 




Every one interested in city problems and 
up-to-date solutions. 



Besides the leading articles, each number contains a Bibliographical 
Index of all the literature during the preceding quarter, 
thus making it possible to ascertain what has appeared upon every 
phase of city government ; Digests of Periodical Literature, 
exceedingly valuable to the busy reader ; and Book Reviews of 
the most important works. 

SEND FOR LIST OF PUBLICATIONS. 



Published by New York Reform Club Committee on City Affairs, 

52 William Street, N.Y., U.S.A. 

Order through P. S. KING, Orchard House, Westminster, S.W. 






STUDIES IN 



HISTORY, ECONOMICS, & PUBLIC LAW. 

EDITED BY THE 

FACULTY OF POLITICAL SCIENCE OF 
COLUMBIA UNIVERSITY. 






Vol. XII. 1899-1900. Cloth. 586 pages. Price 14s. Post 
free 14s. 5d. 

OR IN SEPARATE PARTS. 






PART I. History and Functions of Central Labour Unions. 

By William Maxwell Burke, Ph.D., sometime University 
Fellow in Political Economy and Finance. 125 pages. Price 
4s., post free 4s. 4d. 

Labour Federations — Organisation — Objects and Principles of Central Labour 
Unions — Political Action on Socialism — The Future of Central Labour Unions. 

PART II. Colonial Immigration Laws, a Study of the 

Regulation of Immigration by the English Colonies in America. 
By Emberson E. Proper, A.M., Instructor in History, Boys' 
High School, Brooklyn. 91 pages. Price 3s., post free 3s. 3d. 

Encouragement of Immigration — Restriction and Prohibition— Immigration 
Laws of the New England Colonies, the Middle Colonies, and the Southern 
Colonies — Attitude of England towards Immigration — Distribution and Character- 
istics of Nationalities, &c. 

PART III. History of Military Pension Legislation in the 

United States. By W. H. Glasson, Ph.D., sometime University 
Fellow in Administration. 135 pages. Price 4s., post free, 4s. 4d. 

PART IV. History of the Theory of Sovereignty since 

Rousseau. By C. E. Merriam, Ph.D. 232 pages. Price 6s., 
post free 6s. 4d. 

The Kantian Theory — The Reactionary Theory of Divine Right — The 
Patrimonial Theory — The Sovereignty of Reason — Popular and State Sovereignty — The 
Austinian Theory — Sovereignty and the American Union — Federalism and Conti- 
nental Theory, &c. 

For further information apply to 

P. S. KING & SON, Orchard House, Westminster, 

Prof. EDWIN R. A. SELIGMAN, Columbia University, New York, 

or to THE MACMILLAN COMPANY, New York. 